Genes to Mental Health Network Open Session Meeting
JOSHUA GORDON: Hello and welcome to this investigators meeting of the Genes to Mental Health Program. It's my pleasure as director of the National Institute of Mental Health to welcome you to this first in-person meeting since the beginning of the pandemic. I know that as these awards progress into their fifth year that this meeting takes on increased importance as we attempt to summarize what we've learned, what we've measured, what we've collected, and what we've harmonized, plan final analyses, and discuss the future directions for genetic research into rare diseases.
This Day 1 of the program is open to the public for an important reason. We want to inform the research community of this important public resource that we've been developing. We hope that it can be used for aggregation with other existing studies, to create synthetic cohorts and encourage secondary data analyses, especially since these datasets are really the first to couple deep phenotyping with precise genomics, with the aim of uncovering genotype/phenotype relationships across neuropsychiatric disorders that are due to rare genetic variants.
I've been a champion of this program since its inception. It's tremendously important that we seize the moment and unravel the connection between genetics and neuropsychiatric symptoms if we are to benefit fully from the investments we've made in genetic discovery. This next step along the genes to treatment pathway is really, really crucial, and I'm proud of the teams that you all have put together, I'm proud of the degree to which you've worked together through the course of innumerable challenges over the past five years to amass data and procedures and processes that will keep giving us scientific returns for a long time to come.
So as you gather today in the open meeting and tomorrow as you confer amongst yourselves, I want you to keep this in mind, that the work you are doing over the next two days, the work you've been doing over the last five years, and the work that you will continue to do really is laying the stage for the next generation of treatments for neuropsychiatric disorders across the board.
Thank you very much for participating in this program. Thank you very much for your energy, for your enthusiasm, for your leadership, and for contributing to science.
Bye for now. I'm sorry that I couldn't join you in person, but I look forward to learning of your deliberations.
GEETHA SENTHIL: Good morning. This is Geetha Senthil, previous program officer for the Genes to Mental Health Network. It is my pleasure to be presenting opening remarks at the final annual meeting of the Genes to Mental Health Network, which we began in 2019. We used to call it the NIMH rare genetic disease network then. I regret that I couldn't be there in person for this meeting, and I particularly want to thank you and congratulate you all for shepherding this valuable effort on rare genetic diseases in neuropsychiatry, despite the challenges we've faced due to the pandemic in the last few years.
So after a decade of working at NIMH, I recently moved to the National Center for Advancing Translational Sciences as deputy director of the Office of Special Initiatives. As many of you know, NCATS has a large trans-NIH rare diseases program. I hope to meet some of you there, if you are part of that program at some point. Thank you.
Rare genetic mutations, as you know, significantly increase risk for a variety of neurodevelopmental psychiatric disorders. Though these mutations are individually rare, in aggregate they have a substantial impact on disease risk. This is a graph taken from Stephan Sanders' paper published in Nature Medicine in 2019, which highlights this aspect.
Prior to 2019, before we established this network, the research community used to coalesce around individual rare diseases, such as 22q and 3q29 for the study of psychosis and schizophrenia, 16p11 for the VIP Consortium, and other single gene disorders. These were studied individually, and with the realization that one variant can increase risk for multiple psychiatric disorders, and that more often they're better understood through dimensional symptom assessment, we established the Genes to Mental Health Network. Despite challenges with pandemic shutdowns, we have all worked very hard to make this endeavor a success.
We have collectively paved a path for precision medicine in psychiatry by establishing the team science framework to show to the field how this can be done collectively, by creating an immensely valuable harmonized approach to studying the neuropsychiatric symptoms in patients with rare diseases.
I hope your collective efforts continue to improve our understanding of rare genetic neuropsychiatric disorders. The fine-grained clinical information you have collected and providing this data as a resource will greatly help guide targeted therapeutic strategies, as well as provide better insights into underlying disease-causing mechanisms.
With that, I take off now. I hope to see you in the future, if possible in person. Thank you.
JONATHAN PEVSNER: Good morning. I am Jonathan Pevsner. I welcome everyone to this Genes to Mental Health Network meeting, and I'd like to begin by discussing the genomics priorities of the NIMH and say that the work of this network closely matches these priorities. These include the discovery of common variation underlying mental illness, as well as the discovery and characterization of rare genetic variation, characterizing genetic and phenotypic variation across diverse human populations, and defining the genetic architecture of mental illness. More broadly, I think that the understanding of the relationship between genotype and phenotype is fundamental. Some of the large-scale genomics projects, in addition to G2MH, include the Psychiatric Genomics Consortium, PsychENCODE, and more recently, the Ancestral Populations Network.
So thinking about Genes to Mental Health, I'd like to begin with the gene side as we look at the set of chromosomes and look at some of the presentations on the schedule for today, and these include talks by Jen Mulle on 3q29, chromosome 15, 16, 17, 22, different kinds of particular CNVs and sex chromosome aneuploidies by Armin Raznahan. All of these particular CNVs can be understood on their own or even better collectively, as done by this network, and Genes to Mental Health is also focused on the phenotypic consequences of these different CNVs.
So a paper was published by Jacquemont et al describing the work of this network, and I'm quoting from some of the sections of that paper to say key fundamental questions are do rare variants exert specific or shared effects on psychopathology? What is the extent of phenotypic variability? And another question, an elusive phenome, closing the gap between the tidal wave of gene discovery and phenotypes. These are all questions of great interest.
We look forward to discussions of phenotypic consequences in more detail today, such as measurements of relevant traits and diagnostic criteria, DSM and RDoC, different types of phenotypic measures.
And then I'd like to, in the last slide, say that where this is going is in several directions: thinking about effect sizes of CNVs, influences on cognition, developmental trajectories, and moving towards treatment and precision psychiatry, as described here at the lower right. At the lower left is an image borrowed from SCHEMA, the schizophrenia consortium, in which we can describe the genetic architecture of different kinds of mental illnesses. Here on the x-axis is the minor allele frequency in the general population, with G2MH focusing on rare variation, and on the y-axis the odds ratio, where you can see the CNVs really having very large effect sizes, and they are very important to a broader understanding of mental illness.
I'd like to, with that, conclude and thank you for joining this meeting, congratulate you on your progress. We look forward to the presentations, and I'd like to welcome anyone to reach out to us. For genomics related issues in particular, I'm listing myself, if you'd like to contact me, Miri Gitik, who is a program officer in the Genomics Research Branch, and for G2MH, Amanda Price, another program officer in the Genomics Research Branch. We have additional NIMH staff with genomics expertise as well, as part of G2MH.
Thank you again, and I look forward to the day's proceedings.
SHELLI AVENEVOLI: Hello, everyone. I am Shelli Avenevoli. I'm the deputy director here at NIMH, and it's really my pleasure to be here, especially with those in the room. It's great to see some familiar faces, and it's my pleasure to thank you, also, for coming here for those who were able to make it, and for joining us for those who are joining via the webcast. This is, I think, our third meeting now in our bright new space. So I hope you enjoy it and feel comfortable for those who are here.
For those who know me, my interest and passion has always been in human development, particularly during the first two decades of life, and just a passion for youth mental health more broadly, because of the promise to be able to identify risk early, to identify mechanisms or targets for treatment and preventive efforts, and to essentially, the potential to mitigate the long-term impact of mental illness across the life course.
So this really gives me a special perspective for today's meeting on Genes to Mental Health in this more of a genomics forward-looking approach. You won't be surprised to know that for many decades, NIMH has funded research that's focused more on this case control study approach, where we've looked at cases, patients, with clinical disorders or groups of individuals at risk for a disorder and compared them to controls of some nature, and this has been important, because it's let us know about some of the behavioral, biological, neuro, and other correlates of mental illnesses.
But there are pros and cons to this approach, and one of the challenges is that we know that mental disorders are heterogeneous and that there are multiple pathways to risk and multiple pathways to the onset of psychopathology. Another challenge with this approach is we cannot really see or be able to identify some of those earliest precursors to mental illness, and particularly those factors, whether they're behavioral, cognitive, emotional, or even early symptoms that are not quite observable yet. So the opportunity to identify risk before we actually see observable symptoms.
So the approach all of you are here to talk about today, more of this genomics first approach, is more of a bottom-up approach that starts with well-defined genetic variants, such as CNVs as we've talked about, and focuses on conveying the neurobiological, behavioral, and clinical sequelae that unfold over time and that may confer risk for mental illness.
A potential strength here is that it facilitates prospective studies that can elucidate changes that precede clinical symptoms as I mentioned, and hopefully many decades before we actually see the onset of mental illness. The use of these kinds of prospective longitudinal studies also allows us to better understand the trajectories of mental illness, to identify the mechanisms that underlie mental illnesses, hopefully help us identify predictive biomarkers, either for screening or for tracking the progress of treatment over time, as well as other novel targets for interventions, and particularly preventative ones.
Also, beginning early in the trajectory allows us not only to identify risk that correlates with future diagnosis but to identify protective or promotive factors.
So in addition, another area that we're highly prioritizing right now is to be able to identify early critical periods of intervention, marked either by rapid neurodevelopment or plasticity or social maturation, and also to be able to disentangle the role of environmental factors such as social determinants of health, family and community context, and comorbid health conditions, as well.
So we're really excited to hear about your progress. I'm going to be here for just a little while today, but I'll also be watching some of this online, and really excited that the community is listening in to learn about your progress, but also about the resource that is available to everyone.
So again, thanks so much for joining us and thanks for your attention this morning.
DONNA MCDONALD-MCGINN: Good morning again from the Genes to Mental Health Network. I am Donna McDonald-McGinn from the Children's Hospital of Philadelphia and University of Pennsylvania, and together with my co-chair Jonathan Sebat from UCSD, we'll be introducing this morning's speakers.
So it's my honor and privilege to introduce our first speaker; Dr. Raquel Gur, from the University of Pennsylvania, is going to be presenting integrating insights from the gene first approach in 22q11.2 and 16p11.2 deletion and duplications to neuropsychiatric disorders. Dr. Gur.
Integrating Insights from the Gene First Approach in 22q11.2 and 16p11.2 Deletions and Duplications to Neuropsychiatric Disorders
RAQUEL GUR: Thank you, Donna, and all. I am presenting on behalf of Project 2, which is well represented here.
I will first give you background. We pulled ourselves together across multiple sites; it's a global, North American and European collaboration. Why did we think that the gene first approach was going to be worth the effort of investing in data collection? So Project 2 is data collection, which is a deep breath, a lot of work, and then at the end being able to integrate data.
So as we all know, the rare copy number variants are highly penetrant genetic risk factors. So I immediately positioned to specific loci that we know are associated with very prominent neuropsychiatric features, which is a domain of interest for NIMH. So what we see across loci is developmental delay, cognitive impairment or decline, intellectual disability, attention deficit hyperactivity disorder, anxiety is prominent, autism spectrum disorders, features of them, and psychosis spectrum disorders.
So it gives us an opportunity from the get-go to assess and follow, in our case prospectively, multiple pertinent neuropsychiatric disorders. We can focus on the lifespan perspective here, because we can evaluate kids from early development and establish developmental trajectories with sufficient power to parse the heterogeneity. Heterogeneity is a marked feature of rare CNVs.
Also it provides an opportunity to focus on and translate across animal models, for example, of 22q11.2 deletion, and iPSC, moving across domains from clinical neuroscience to basic cellular and molecular work, which enables us to probe mechanisms. Furthermore, the findings from rare CNVs are likely to be generalizable to neuropsychiatric presentations in the common variant space, so many of us compare with common tools in rare CNV loci and then in common variants, which is part of the Psychiatric Genomics Consortium, to look at parallels. So generalizability is important, and it provides us with a step toward targeted therapeutics.
So at the bottom, you can see how we are attempting in this project to create bridges from deep clinical phenotyping to genomics that are available for all of the participants, and we focus on neurobehavior and extend it to the environment. Other studies, individual R01s and collaborative ones, look at multimodal neuroimaging, iPSC, and other translational measures.
With this background, you see why we were motivated to work hard as a group. We built upon a previous collaboration, IBBC, supported by NIMH, that focused on 22q11.2 deletion, and in response to the RFA, the mandate was to expand and look at other informative loci. So other investigators have joined the group, and it has been a good, exciting partnership.
We span multiple institutions: in Philadelphia, Penn and CHOP, located on the same campus; in Los Angeles, UCLA; UCSD as a genomic site; Toronto, with three investigators; Montreal; and on my left Cardiff, Maastricht, and Leuven. I want to welcome dear colleagues in Europe who are joining us virtually today. So this was the group that was tasked with putting the grant together, which was the easy part, I will say, compared to implementation.
So when we put the grant together, we had several specific aims. One is to apply common dimensional phenotypic measures across the four CNVs. So for 22q11.2 and 16p11.2, deletions and dups, we had four groups and applied the same dimensional phenotypic measures and cognitive evaluation relevant to neuropsychiatric disorders, categorical and dimensional.
Another is to look at the genetic determinants of psychopathology in these CNV carriers, and to get a glimpse, which is hard to accomplish, of the effects of genetics and environment. There's a reason that we don't study genetics and environment and we focus on genetics. If we think the genetics are complex, environment can be a mess. So how do we come up with common measures?
And do all of this responsibly, so we can put high-quality data out for investigators and in the public domain. Also, an important point that guided some of our efforts is to be cognizant of our stakeholders: listen to the families, give back to them, and conduct research, for example, during the COVID period, on how it impacted them.
So, for the phenotypic assessment, I said it's easy to write, relatively. It's harder to implement. So today I will just highlight briefly, and my colleagues later on are going to talk more in detail of each of the behavioral domains.
In neurocognitive assessment, including the computerized neurocognitive battery, psychopathology, screening, across disorders, with emphasis on adding measures that relate to psychosis so we can look at people at risk for psychosis, ASD, family assessment, and looking at life events. This was an expansion of what has been done previously.
For the cognitive battery, you'll hear about it later on. It measures executive control related to dorsolateral frontal lobe function, episodic memory, complex cognition, social cognition, and sensorimotor function, with multiple tasks for each, and it takes about an hour. It depends on the abilities of the individuals. In our study, the lower age limit was 7, with no upper age limit.
This is just an example of what we call, to the study participants, especially the kids, computer games, and they like it. We illustrate the domains of executive function at the top, and social cognition, identifying faces and emotions on the face. As I mentioned, it takes about an hour, and Ruben later on will go more into it, but I'd like in the public opening to allude to it.
What is the structure and the workflow of pulling it all together? So the European sites, Cardiff, Leuven, and Maastricht, are data collecting sites. CHOP and Penn are a data collecting site, Montreal is data collecting, and UCLA and Toronto are data collecting, but also for genomics, and UCSD is with Toronto for the genomics. So we have formed working groups and, in addition to the sessions where all investigators of the project meet regularly, we also have the specific working groups.
So we have the phenotyping group that coordinates with the data coordination at Penn, and that was also added to integrate with Project 3. An important component of the network is to go across projects that were formed independently when we were joining the network.
And then there is the genotyping that is done in Toronto at TCAG, the genetic analysis at UCSD, and integration between the phenomics and genomics. NIMH has supplemented us to add familial data to the project. We thought that it would be highly informative for genomics and for the environmental measures to look at parents and family members, who undergo the global screening array, while the probands go through whole genome sequencing. From that, we extract, transform, and load the data, as you can see on the right here. It undergoes QC and goes to NDA. So all data, phenotypic and genomic, is there.
I said it's relatively easy; for each of these components, you have to do a memorandum of understanding. So this is the process it takes. I see Jacob moving his head. It's a process, and it takes this amount of time to establish the connection between all the sites.
Okay, so for the implementation, with this plan in mind and all the agreements in place, we moved toward establishing -- and this was in the first meeting here and you'll hear about it more in the afternoon -- a core psychopathology assessment and training across sites in the CNB, the computerized neurocognitive battery, with appropriate language translation. There's English, but there are also French and Dutch versions, and all the coordinators were trained and certified in administration of the battery.
All phenotypic data comes to Penn and, as I said, links Projects 2 and 3, and (inaudible) communicates via Slack with coordinators to address questions, with ongoing communication, phenotypic QC, and linking to genomics. So that's where we are as far as the process, which is ongoing, and we are close to being ready for data analysis.
This is the genomic data flow. Blood or saliva is shipped to Rutgers and from there to Toronto, or directly to Toronto. So the process is all well integrated across the sites and now moving along beautifully, I will say, compared to the initial phases.
The core phenotypic measures that you will hear about later on look at domains of behavior, prospectively collected: developmental milestones, adaptive functioning, cognition and IQ, educational attainment where available, and symptoms across major domains of psychopathology, in addition to medication and anthropometric measures that are available for the participants.
So there are several main principles in putting it all together. In a meeting on establishing reliability, which you will hear about, we made decisions on how to integrate across sites, because each site has its own history and expertise in evaluating study participants. How do you put it all together in a consistent and prospective way, examining all available data, observation of the participants, collateral, medical records, and school reports, and calculating it in a way that is consistent across the sites? This took a tremendous effort, and you will hear about it later in more detail. The phenotypic working group was led by Jacob Vorstman and Carrie Bearden, with phenotypers contributing across the sites.
The disorders I mentioned already, and there is a consistent scoring system that underwent careful review.
Okay, so where are we now, as of Friday afternoon? We have collected data on 1,927 study participants. You can see that of them, over 1,100 were individuals with a deletion or duplication at the specified loci. You can see there's a similar distribution of males and females, which will enable us to look at sex differences. We have family members who are not affected. For the age, you can see the range and the median, and of course the parents are older than the kids, but we have a range from 7 to 66. So when we examine the data, it will enable us to look at age bins and look at them developmentally.
While we make an effort to recruit from all diverse populations, most of the participants, you can see, are white, close to 86 percent, and that's the distribution at the sites that collect the data. Sometimes we wonder what is the reason for it; are, for example, Blacks protected? We don't know. But as studies expand in Africa, we will be able to address that.
Okay, now broken down by loci. What you can see is that, given the frequency of these loci in the population, our sample for 22q deletion is over 600 people, probably the largest sample put together. We have fewer of the 16p deletions and duplications and of the 22q dups. This sample is growing, and it will continue. We all said that during the year of no-cost extension, we will allocate funds and we will continue to collect the data. But you can see, this gives you a glimpse of what we have per group. So we have to be careful with some of the data that I will show you.
So, all right. This is the relation of psychopathology domains in our project in the network. What you see at the top is the 16p deletion, on the right 16p dup; below, 22q deletion and 22q dup. These present, for each of the loci, what percent of the participants in that group show bipolar disorder features, anxiety features, psychosis, ASD, and depression. They're all laid out the same, and you can see that if you go to the deletion in green, they have the most marked; if you look just at the delineated figure, you can see that they show most of the psychopathology, most of them anxiety and ADHD, and psychosis over 25 percent. That's what the literature suggests for 22q deletion. Okay?
But about 50 percent of them show significant features of depression, so 22q11.2 deletion is enriched for psychopathology across the board. There are many kinds of morbidity. Look at it compared to the dups. Much less.
Now if you look at the 16p, you can see that there is psychopathology across domains, but much less than in the 22q deletion, and it's smaller than in the 16p dups. But that's the first attempt to relate it and see what is in common and what differentiates.
And then, this is coming from the genomic working group with Amanda, who works with Sebat, and it's very preliminary, but I think looking at preliminary data is important. How does the psychopathology that I've just shown you in Project 2 of Genes to Mental Health relate to the Psychiatric Genomics Consortium?
With the Psychiatric Genomics Consortium, now we are talking effect size rather than what percent; you've done it the right way. Our sample is too small to look at it for some of them. But this is large enough, and what you can see is the pattern, if you look at the effect size. Now we have ASD, bipolar disorder, major depressive disorder, and PTSD added, and these are primarily adults, and schizophrenia and ADHD, where you have more kids. What you see is that there's significance for the 22q duplication, in blue, compared to the 22q deletion, with more marked psychopathology in the PGC, similar to what we have seen here in our own study with the small sample and some of the measures more dimensional.
So the pattern is similar, which is encouraging and a way of integrating rare variants with common variants, which we would like to do at the end.
Where we are moving forward: continue with data collection. We're committed to it so that our sample will be as large as possible. Ongoing QC, so we post rigorous, high-quality data for analysis in the public domain, and of course, integration within our project, but also with other projects, for example, Project 3. Data analysis and manuscripts are on the way. There's been the paper by Sébastien on behalf of the entire consortium, and there are several papers now in submission that capitalize on the experience and data that we have available.
What lessons have we learned about how you run such an effort? How do you move prospectively, which is very different from retrospective studies that are just hard to force, but here planning together. Method development. What is the feasibility and how much does it cost?
In answering the clinical relevance by giving back to the clinicians, and I mentioned the families, the stakeholders, and gearing toward clinical trials that require expertise and ability to work with families and to apply rigorous approaches that can meet FDA standards.
With that, the last slide, thank you to all of Project 2 investigators, to the dedicated research teams that have worked with them, to the study participants and the families without whom nothing would have happened, and to NIMH for working with us in a new mechanism and supporting us, Lora, now Stacia, keeping us on our toes and contributing. Thank you very much.
DONNA MCDONALD-MCGINN: Thank you so much, Dr. Gur. We do have a slight change in the order of presenters. So now we're going to go to Dr. Christa Lese Martin, and she is going to present Leveraging Rare Genetic Diseases to Advance Knowledge and Treatment of Brain Disorders. Dr. Martin.
Leveraging Rare Genetic Diseases to Advance Knowledge and Treatment of Brain Disorders
CHRISTA LESE MARTIN: Thank you. Good morning, everybody. I'm happy to be here on behalf of Project 3 to present our progress as well as an annual update on behalf of all of our team, and I want to acknowledge their contributions up front, including Dr. David Ledbetter, who serves as MPI with me on this grant.
And as just introduced, we've really focused on leveraging rare genetic diseases to try and advance knowledge and treatment of brain disorders. As you just heard Raquel say, we all have an eye toward someday when we'll have clinical trials that are ready to be conducted in individuals with rare genetic diseases, and we hope to build the datasets to have ready to go as those trials emerge.
So I just want to begin by introducing our study team on the next slide. Our team at Geisinger is led by myself and David Ledbetter, as I just mentioned, who has recently transitioned to the University of Florida. We have a broad multidisciplinary team, including clinical psychologists, neurodevelopmental pediatricians, geneticists, et cetera. We also have colleagues at the University of Washington, led by Emily Neuhaus and Evan Eichler, as well as John Constantino, who was at Washington University in St. Louis when we started this grant, and his team members from there are acknowledged; he has now transitioned to Emory University.
And then we've brought in some affiliate groups, which we're really excited about, including Santhosh Girirajan from Penn State, Jen Mulle from Rutgers, previously at Emory, and Daniel Moreno De Luca, who has recently transitioned to the University of Alberta, and we brought those individuals in because they're building large cohorts around particular CNVs, including 16p12.1 in Santhosh's case, 3q29 deletions and duplications in Jen's lab, and then Daniel Moreno De Luca focusing on 17q12 and 15q deletions and duplications.
So we are really excited about the team we've gotten to work together, as well as across the network. I will point out that, in addition to adding affiliates, one of the things we focused on was really trying to develop trainees and young investigators in this space. Highlighted in blue are some of our early investigators or trainees, and one of the areas that we tried to grow in was really bringing child psychiatrists onto our team to help them learn about genetic etiologies and how to incorporate genetic testing and this type of information into their clinical practice.
So with David Ledbetter at University of Florida is Taka Soda. John Constantino has been training Cory Patrick, who has now moved on to UCSF, and then we have Daniel, who is not an early investigator, but another seasoned child psychiatrist working with us on the genetics of these conditions.
So the goals of our study are listed here, and one of the things that we wanted to do was really try to leverage existing clinical data for driving our research. So we developed Aim 1 to really capitalize on data collected as part of clinical care, including leveraging electronic health record data and other testing that's done as part of routine evaluations. Under this aim, our goal was to recruit over 1,000 individuals who had any rare genetic disease. These would be individuals identified through clinical genetic testing as having variants that are known to be causative of brain disorders.
In our second aim, we became a little more specific and had a set of targeted copy number variants, as well as single gene disorders, many of which overlapped with other members of the consortium. Those are listed here, including 1q21.1 deletions and duplications, 16p11.2 deletions and duplications, 22q11.2 deletions and duplications, 15q11.2 deletions and duplications, 15q13.3 deletions, and one single gene, CHD8, and the variants that were included in that.
And in this aim, our goal was really to assess these families who are living with these rare genetic diseases and do more detailed phenotypic studies, adding onto the general information that was collected as part of routine care. So really extending our phenotype battery.
And then in Aim 3, our goal is to try to understand other genetic contributions to brain disorder and severity. So this goes to a broader theme of our consortium, trying to understand variable expressivity and why individuals with the same genetic etiology present with different phenotypic results.
So some of the things that we've been starting to evaluate are looking at the contributions of polygenic scores to try to explain risk and resilience that can be attributed to family background, as well as looking at second hits across the genome from other rare genetic disorders that could also contribute to this variable expressivity.
As I mentioned, one of our goals was to look at how we can leverage research or improve research embedded within clinical care. So if we start on the left side of this figure, when a patient is referred for suspected neurodevelopmental psychiatric, or NPD, concerns, we have a system in our clinics where parent and guardians complete core assessment surveys to help inform their diagnosis, and some of the surveys that are conducted are listed there.
We also do cognitive testing, completed during the clinic visit, to arrive at an NPD diagnosis that is made by our multidisciplinary care teams across our sites. When an NPD diagnosis is made, that informs the ordering of clinical genetic testing and currently most of our sites are doing exome sequencing as that first genetic test that's being offered to look for both copy number variants, as well as single gene etiologies, and once a genetic etiology is identified, that is the point where we offer broad research consent to the patients and their families to not only enroll them in the current study but allow for future research recruitment as we go on.
So the clinic patients with genetic diagnoses can then be recontacted for research under this study, and we use the existing genetic, health, and assessment data that's been collected as part of routine care and then can also apply that to other studies going forward. As I mentioned, part of this research is to add additional assessments to augment the existing clinical data that's more specific to the actual research questions being asked.
So just as an example, at the Geisinger site for the G2MH clinical recruitment, we've actually recruited more than 75 percent of our G2MH participants through our clinics, and they're recruited after receiving a positive result from clinical genetic testing. Most of these patients completed their core assessments as part of clinical care, and we found that using this clinic information as research data actually reduces participant burden and also saves grant funding, allowing it to be used for other, more detailed types of assessments.
And this table just shows the proportion of assessments leveraged from clinic, and you can see that we're actually quite successful in collecting the data from clinical care for use in our research endeavors.
So this slide just lists both our core domains that are evaluated, as well as our expanded domains, covering cognition, social skills behavior, adaptive skills, and language, and expanded started to look at executive function, visual motor skills, schizotypy, and neuropsychiatric history, as well as more detailed medical history.
The assessments that are bolded are ones that are shared with project 2, and as Raquel alluded to, we really have worked quite hard to try to align our projects' data collection in the area of phenotyping as much as possible.
So this is an update on our recruitment numbers. You can see for Aim 1, this is the one that's any rare genetic disease identified as part of clinical testing, we've now recruited more than 1,700 patients, and then in Aim 2, where we're specifically focusing on those eight genetic etiologies, we've now recruited 363 probands and also 400 family members.
Family members will become important when I speak about a little later some of the evaluations we're doing looking at family background and how it influences the phenotypes that are expressed in probands with these conditions.
And you can see for each of those, both copy numbers and single gene disorders, that we're now amassing the numbers that are needed to do some of the statistical analyses and have the power to do evaluations that we've proposed in our research study.
The next two slides are just showing what data we've collected as far as the assessments related to the core battery as well as the expanded battery. You can see in the first line here for any rare genetic disease, we are starting to get quite large numbers in regards to some of the individual assessments. For example, in IQ, we have over 710 individuals where we have IQ assessments available for analyses.
And then below are also the individual deletions, duplications, or single gene variants in CHD8 where we also have that data, as well as we're able, we collect family members' assessments as well. So this is setting us up for analyses to either look within or across genetic etiologies to examine the effect of a genetic etiology on human disease.
This slide just shows similarly the numbers that we've collected now for those assessments that are included as part of our expanded assessment battery. Again, not as large yet, but getting to be of the size where we can start to do some of the analyses that we've proposed to do.
Since the last year and our annual update, our group has continued to be productive and has had several publications which appear on the next few slides. I'm going to highlight the work of two of these studies, but also want to just point out that in general we've worked in areas either defining new phenotypes that are seen in some of these genetic conditions. So we have focused on cerebral palsy as a cause of NPD as well as motor/speech disorders and started to do more genotype/phenotype analyses in those conditions.
Our colleagues at the University of Washington under Evan Eichler have continued to look for new causes of NPD, as well as describing some of the genotype/phenotype associations with some of the existing genes that are known. We've also worked on sex chromosome aneuploidies and from a genotype first perspective have shown that these individuals are actually at risk for venous thromboembolisms or blood clots, which has significant findings related to their health as they get older.
And then we also have done work, cross-collaborative work with the G2MH consortium in several areas, including looking at how COVID impacted individuals with 22q11.2 copy number variants, as well as looking at the community's perspective and looking at experiences and opportunities in rare genetic disease research.
As I mentioned, I'm going to highlight two of the studies that we've done. One is part of GeneReviews, which, for those of you who are not familiar with it, is really a publication that defines the gold standard guidelines for diagnosis, treatment, and management for patients who have specific genetic etiologies. CHD8 is one of the single gene disorders that we've worked on, led by the University of Washington team of Emily Neuhaus and Evan Eichler, and we realized that while this is one of the most common genes that's observed to have changes that cause neuropsychiatric developmental disorders, there was no GeneReviews entry on this particular genetic etiology, and so our team led by Marissa Mitchel did a review of the literature, surveying more than 45 published studies about CHD8, including studies that were conducted by our team members, and summarized the known clinical features that are seen in association with this particular genetic change, which we called neurodevelopmental disorder with overgrowth, given the more common features that are observed.
And you can see in the box on the right some of the phenotypic features that are associated with changes in this gene, and particularly when this gene was first described in NPDs, it really was sort of used as the example of an autism gene, and I think over the years from doing more studies and looking more broadly at phenotypes and trying to remove biases that were inherent in some of the early studies, you can see that actually other developmental disorders are just as common as autism. So again, speaking to variable expressivity of these developmental brain disorders where you can see a smattering of different clinical phenotypes. I think that is reminiscent of the 16p11.2 deletion that a copy number change, again when it was first reported, it was reported as a change seen in autism, and now we know that the phenotypic spectrum is much broader in the sense of looking at brain conditions observed with that.
I also wanted to highlight recent work led by Cora Taylor that continues on a theme that our group has focused on, which we've called phenotypic shift in looking at copy number variants or rare single gene disorders. We published our work in Genetics in Medicine earlier this year, really to expand on some of the previous work that we and others have done to look at variable expressivity.
So one of the main questions that our grant has been trying to address is why do individuals with the same pathogenic CNV, and that's even within families, exhibit variable phenotypes. So if we just take this pedigree as an example, many years ago somebody might have looked at this and said this family has a really unfortunate family history. They have a lot of different brain conditions. And assumed that they were all caused by different things, because they're different clinical diagnoses.
But if you think about it from the perspective of being under a broad umbrella of brain conditions, you can understand that all of these phenotypes can be caused by a single CNV, and so now we understand that we need to look at this phenotypic expressivity, but we need to understand it better.
We and others have previously published on 16p11.2 deletions and 22q11.2 deletions. Many of the colleagues sitting in this room or on Zoom and affiliated with the G2MH consortium, trying to understand family background and the influences that it portrays on clinical variability in these genetic neurodevelopmental conditions. As I said, and we coined a cute name for this, title for this, called shift happens. But basically, the idea is if you look at these two families, family A and family B, both who have a child with the same CNV, and here we're using IQ as a quantitative trait to examine this shift phenomenon, the blue curve represents the normal distribution of intelligence quotient or IQ in the general population, and then the smaller orange curve shows the IQ distribution for individuals with a particular copy number variant. And the circles on each curve indicate where the IQ scores are for specific family members.
In both family A and B, you can see that this particular CNV, that's the same CNV, confers the same magnitude of deleterious impact or shift on a child's IQ. In family A, the CNV shifts the affected child's IQ range into a diagnosis of intellectual disability, whereas in family B, because the family IQ starting point is higher, the CNV shift does not reach the defined threshold for a diagnosis of intellectual disability, even though the shift or the deleterious effect is the same size.
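To make the family A versus family B cartoon concrete, here is a minimal sketch of the shift idea. The family baselines, the 2 SD shift, and the cutoff of 70 are illustrative assumptions, not values from the study; the only point is that the same shift crosses the threshold in one family and not the other.

```python
# Hypothetical illustration of the "shift" cartoon; every number here is made up.
IQ_SD = 15          # IQ is conventionally scaled to mean 100, SD 15
ID_THRESHOLD = 70   # assumed IQ cutoff for intellectual disability in this sketch

def expected_proband_iq(family_baseline_iq, shift_sd):
    """Family baseline minus a fixed CNV shift expressed in SD units."""
    return family_baseline_iq - shift_sd * IQ_SD

def meets_id_threshold(iq):
    """True when the score falls below the assumed diagnostic cutoff."""
    return iq < ID_THRESHOLD

shift_sd = 2.0  # same assumed shift in both families
family_a_iq = expected_proband_iq(95, shift_sd)    # lower family starting point
family_b_iq = expected_proband_iq(115, shift_sd)   # higher family starting point

print(family_a_iq, meets_id_threshold(family_a_iq))
print(family_b_iq, meets_id_threshold(family_b_iq))
```

Both children lose the same 30 points in this toy example, but only the child in family A ends up below the cutoff.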
So the other important thing to consider here is that even in family B, because the family IQ starting point is higher -- oh, sorry, I just repeated what I said.
So as I alluded to, we published on this in 2014, specifically focusing on 16p11.2 de novo deletions, and here you can see real data points as compared to the cartoon that I showed on the previous slide, but basically demonstrating the same phenomenon where you can see the parents and siblings in the gray and yellow compared to the probands that there is a shift in the probands that is about 1.7 standard deviations for FS IQ, and on the right side, the severity increases from left to right, because this is based on SRS, another quantitative trait that's known as the Social Responsiveness Scale, and that the shift is about 2.2 standard deviations from what is expected based on the parent and siblings performance.
In addition, you can see from this slide that the intraclass correlation between the probands and the family members was maintained, suggesting that an individual's baseline functioning related to family background influences that individual's phenotypic expression. So this is what we were wanting to observe related to family background: basically, that in families where the parents have higher IQs, the children also have higher IQs. So demonstrating the contribution of family background to these quantitative traits.
So given that, we wanted to see if we could replicate that in other CNVs that we are focusing on as part of our G2MH study, and so Cora led work to look at evidence in 16p11.2 duplication syndrome. Again, I'm showing data just for the full scale IQ score and for the Social Responsiveness Scale raw score, but you can see that we see the same phenomenon that we can identify a shift of about two standard deviations.
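A shift like this can be expressed in standard deviation units by comparing the proband mean with the mean of first degree relatives. The sketch below assumes the shift is simply the difference in means divided by the population SD of the trait; the scores are invented, and the published analysis may have used a different estimator.

```python
from statistics import mean

def shift_in_sd(proband_scores, relative_scores, population_sd):
    """Difference between relative and proband means, in population SD units."""
    return (mean(relative_scores) - mean(proband_scores)) / population_sd

# Hypothetical full scale IQ scores (population SD of 15), not real study data.
proband_iq = [70, 82, 64, 76]
relative_iq = [100, 108, 96, 104, 112, 98]

iq_shift = shift_in_sd(proband_iq, relative_iq, population_sd=15)
print(iq_shift)
```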
In this case, we combined all first degree relatives and compared that to probands so that we could confirm that these types of shifts are evident when we look at the difference between family members and probands with these genetic etiologies of NPD, and we did this data analysis by doing specific phenotypic assessments on family members, and we know that those types of assessments are tedious, as everybody in this group has done them, and so while it's a useful research tool to understand these changes in phenotypic expression, it's really impractical to do going forward, and to do on larger scales to examine more individuals.
So recently, polygenic risk scores have been more directly investigated to look at this family background contribution to variable expressivity, and clearly further refinement of these types of analyses could be very useful, and what we hope would be able to narrow the prognostic diagnoses in these individuals. So starting to explain variable expressivity more clearly at an individual level by allowing the magnitude and type of NPD risk in these individuals to be identified so that clinicians who take care of them can have a more proactive approach to targeted interventions, rather than waiting for symptoms to emerge and not being able to predict which is more likely.
From these analyses, what we've learned about risk and resilience for NPD is that these types of rare copy number variants or single nucleotide variants have large primary impacts on neuronal pathways, and we view these as causative of NPD. We've also shown that there are large effects and phenotypic variability that are modulated by family background, combining genetic and environmental effects. As I mentioned, we are now and others exploring how that genetic component is measured by polygenic scores, and how the common variants contributed additive effects to those sort of instigating rare variants.
Of course, there's always stochastic or other environmental effects that can continue to modify the phenotype and contribute to variable expressivity.
So our G2MH project through the recruitment of individuals with these rare genetic etiologies, as well as doing our core and extended phenotype batteries, has now established a cohort that contains structured genotype and phenotype data which we hope will continue to allow these types of analyses to be carried out, with an overall goal of better quantifying the type and magnitude of NPD risk in individuals to enable a more precision health approach to targeted interventions in the future.
And I'll just finish up on the last slide by highlighting some of our work in progress. So I mentioned that we have particular CNVs that we're focusing on to do these more detailed genotype/phenotype analyses. Two of the child psychiatrists who I introduced at the beginning, each have taken on a project related to either 1q21.1 deletions and duplications at UCSF or 15q13.3 deletions at the University of Florida.
In the case of 15q13.3, we've actually started a specialty clinic where we invite participants with that particular deletion to come for a multidisciplinary clinic where we use that to gather some of our additional phenotypic batteries and get information on family members. So we have one of those clinics so far.
We're also looking at a case series on adults with CHD8 variants. So this is a collaboration between the Geisinger and UW sites. Geisinger's MyCode genetic study has an adult cohort with different changes in genes that we know cause NPD. So we're using this genotype first approach to look at a broader spectrum of what type of phenotypes we might see in adults with these conditions that you might lose due to biased ascertainment of more severe cases that are seen in a pediatric setting.
We also have a study led by Brenda Finucane at Geisinger looking at dementia risk in some of these rare genetic conditions, and then later on in this meeting, you'll hear from two of our team members, Marissa Mitchel from Geisinger and Corinne Smolen from Penn State, working with Santhosh Girirajan on rare genetic etiologies of pediatric motor speech disorder and then more about the rare and common variant effects in 16p12.1 deletions.
And then finally, just highlighting some of our ongoing G2MH collaborations across our consortium, looking at harmonization of data using REDCap and our data collection center, looking at a combined analysis for copy number variants and PRS. The work that was mentioned earlier and is now a submitted publication, related to the core psychopathology summary, so where we've tried to harmonize the work that we're doing in phenotyping individuals across sites, and the behavioral and social communication profiles in 16p11.2 CNVs.
I would like to stop there, thank our team members, and thank you for your attention.
DONNA MCDONALD-MCGINN: Thanks so much, Dr. Martin. Moving to our next speaker, again we have a change. Dr. Elise Robinson from the Massachusetts General Hospital and Harvard Medical School will be presenting accidental investigations of 16p and 22q.
Accidental Investigations of 16p and 22q
ELISE ROBINSON: So, Project 4 actually in part supports the NeuroDev, but Jonathan asked me to talk about something different, and I'm sure he'd be happy to receive any concerns or complaints about that request.
When we received funding for Project 4, the lab wasn't doing anything particularly specific to 16p or 22q. It's just a series of collaborations that emerged quite accidentally around the same time, hence the title of the presentation, and then we just happen to be members of this consortium.
So to provide a brief introduction, since we're in our rare variant space conversationally at the moment, autism as we all know is influenced by rare and common variation. The majority of the contributing genetic influences are common, though decidedly more difficult to pin down and interrogate, and there are different ways to estimate kind of how much the rare variant signal can tell us about the common variant signal.
One of them is through techniques like the AMM model that was developed by Dan Weiner when he was working with Luke O'Connor as part of his PhD, and we estimate like with most neuropsychiatric complex traits, the fraction of heritability mediated by rare variations is pretty low.
But more importantly, in another project, Dan and another phenomenally talented MD/PhD student, Ajay Nadig, who is an alumnus of the Raznahan group, amongst other things, and was also working with Luke O'Connor, asked the question to what extent do the rare and common variant influences on a variety of different outcomes, both brain and body, seem to point to the same cell types and tissues. This work was recently published, and, in general, they seem to be quite concordant. This is true across the variety of somatic disorders they looked at, as well as for bipolar disorder and schizophrenia.
Which is good news for all of us. They also found that in general, the rare variant influences on a trait and the common variant influences on that same trait appear to have similar suites of phenotype correlations, meaning the cross-trait correlations between rare and common variant influences tend to be the same.
So autism was not included in that analysis, mostly because the majority of our rare variant considerations come from trio data and the BHR approach, which is what that paper focused on, isn't currently calibrated for trios, though Luke and Ajay are working on it perhaps as we speak.
One thing to note is the quite curious disagreements between autism's rare and common variant influences, particularly with regard to other phenotypes. So while we don't know through that analysis whether its cell types and pathways are similarly pointed to by the GWAS and rare variant data, we do very clearly know that on average, rare variants associated with autism through exome sequencing are associated with intellectual disability and we know that the GWAS signal, the common variants associated with autism, are on average very positively associated to intelligence in the general population. This is a curious thing that no one fully understands, but at the outset already kind of runs up against what we expect from the majority of complex traits.
And it provides extra motivation for continued pursuit of the autism GWAS, which I co-organized with Anders Børglum, one of these very nice analysts you see pictured here, which at this point is often going to be a logistically challenging activity, but I think one in the end that we will appreciate.
Now, going to the accident. So Dan, who was a shared graduate student between Luke's and my group, Dan and I were interested in developing a statistical approach to better understand variability and regional relevance of the autism polygenic score, and we were developing a method called stratified pTDT where we tried to find kind of portions of that polygenic score that were more statistically associated to autism when you compare diagnosed children to undiagnosed parents, and simply over the course of normalizing this method, trying to get a baseline of what any given chunk of the genome should do, we noticed that there was one chunk of the genome that was behaving in a very statistically odd way, and specifically there was only one section of the genome that across multiple cohorts was consistently more associated with risk than it should be, given its size and kind of constitutional properties, like the number of genes, et cetera.
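For readers who have not seen the test, here is a minimal sketch of the basic idea behind comparing a child's polygenic score to the parents' scores. It assumes the commonly described pTDT form: the child's score minus the mid-parent score, scaled by the SD of the mid-parent scores, tested against zero with a one-sample t statistic. The stratified method from the talk is not reproduced here; one would apply the same calculation to the score from each genomic partition. All scores are invented.

```python
from math import sqrt
from statistics import mean, stdev

def ptdt_deviations(child, mother, father):
    """Child score minus mid-parent score, scaled by the SD of mid-parent scores."""
    midparent = [(m + f) / 2 for m, f in zip(mother, father)]
    sd_midparent = stdev(midparent)
    return [(c - mp) / sd_midparent for c, mp in zip(child, midparent)]

def one_sample_t(values):
    """t statistic for the null hypothesis that the mean deviation is zero."""
    return mean(values) / (stdev(values) / sqrt(len(values)))

# Toy scores for one hypothetical partition of the polygenic score (five trios).
child = [0.9, 0.4, 1.2, -0.1, 0.6]
mother = [0.2, 0.1, 0.8, -0.5, 0.3]
father = [0.6, -0.3, 0.4, 0.1, 0.1]

deviations = ptdt_deviations(child, mother, father)
t_stat = one_sample_t(deviations)
print(round(mean(deviations), 3), round(t_stat, 3))
```

A positive mean deviation means the children carry more of the score than expected from their parents, which is the over-transmission sense used in the talk.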
And that was the p-arm of chromosome 16. This wasn't localized to any particular region. In fact, the three blue dots in the top right of this plot are the three across multiple datasets that are consistently significant and they just kind of panel the entire 33-megabase p-arm.
And that motivated us to start looking at it as a unit, and in fact, if you compare 16p to every other 33-megabase region of the genome, you can see this big gap in the amount of association to autism polygenic score in the over-transmission sense, and there are no GWAS hits in there. If you do something kind of simple, like remove little LD blocks all across the p-arm, you can't find any big drop, which suggests that it is indeed a very diffuse association across this entire area.
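The leave-one-block-out logic can be sketched as follows. This is a simplification that assumes a region's signal is the sum of per-block contributions; the real analysis would recompute the transmission statistic after removing each LD block. The block values are invented.

```python
def leave_one_out_drops(block_scores, statistic=sum):
    """Fractional drop in the regional statistic when each block is removed."""
    full = statistic(block_scores)
    drops = []
    for i in range(len(block_scores)):
        rest = block_scores[:i] + block_scores[i + 1:]
        drops.append((full - statistic(rest)) / full)
    return drops

# Hypothetical per-block contributions to a region's association signal.
diffuse_region = [1.0] * 20            # signal spread evenly, as described for 16p
hit_region = [0.2] * 19 + [6.2]        # one block (a GWAS hit) carries most of it

print(max(leave_one_out_drops(diffuse_region)))
print(max(leave_one_out_drops(hit_region)))
```

In the diffuse case no single block accounts for much of the total, while in the second case removing the one large block produces a meaningful drop.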
If you do this, for example, if we go back a slide, the red dots in this plot are all equally -- or partitions that include an autism GWAS hit and, you know, reassuringly, they are shifted up and to the right in terms of their distribution of transmission statistics.
If you were to do -- if we go, can we go back? Thank you. If you do this analysis on the right with one of the sections that includes a GWAS hit, you do indeed see a drop, like a meaningful drop in the amount of signal that is covered in the region.
So this led to a fascinating PowerPoint error, also some great collaborations, with people who have various forms of data to help us perhaps figure out what's going on here. On the left-hand unreadable plot, we have data generated by Mike Talkowski and Serkan Erdin in his lab. They have CRISPR-induced 16p11.2 deletion stem cells. This is in neuronal progenitors, and what we have found in this plot is that the entirety of the p-arm of chromosome 16, the vaguely 250 brain-expressed genes in the region, on average have reduced expression in the presence of the .5 megabase 16p11.2 deletions. There's this tagalong or extended cis effect as we started referring to it as, that occurs across the p-arm of chromosome 16 but not in the rest of the genome.
And in a completely separate data modality and analysis, using postmortem single cell brain tissue data from Steven McCarroll's group, we found that higher polygenic score for autism on chromosome 16 was also associated with on average reduced expression of that same set of genes across the p-arm, and all this is published. It was published last year, and I'll have a citation in a second.
In both cases, we saw the majority of effects concentrated in this kind of pseudo-telomeric region between 0 and 6 megabases on the p-arm, which as those of you who are very into 16p know is actually on the opposite side of the p-arm from the deletion.
And that gave Dan the nice idea to check contact patterns across the p-arm of chromosome 16 in 3D genome space using Hi-C, and on the left plot -- again, this is all in the paper -- every dot on the plot is a 33-megabase region in the genome and on the x-axis, we have the average amount of contact between genes within each segment.
So not between the segments, but all the genes within a 30-megabase segment. That's their average degree of contact in lymphoblastic cell line reference, Hi-C dataset, and then in the y-axis we have it in fetal cortical plate data datasets, so it's just two Hi-C datasets.
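As a rough illustration of the within-segment measure, the sketch below averages Hi-C contact over all pairs of gene-containing bins inside one segment. The matrix, the bin assignments, and the function name are hypothetical; a real analysis would use normalized contact maps and the actual gene coordinates.

```python
from itertools import combinations

def mean_within_segment_contact(contact, gene_bins):
    """Average contact over all pairs of gene-containing bins in one segment."""
    pairs = list(combinations(gene_bins, 2))
    return sum(contact[i][j] for i, j in pairs) / len(pairs)

# Toy symmetric contact matrix over six bins; values are invented.
contact = [
    [0, 9, 7, 1, 0, 1],
    [9, 0, 8, 1, 1, 0],
    [7, 8, 0, 2, 1, 1],
    [1, 1, 2, 0, 3, 2],
    [0, 1, 1, 3, 0, 4],
    [1, 0, 1, 2, 4, 0],
]

segment_a = mean_within_segment_contact(contact, [0, 1, 2])  # high-contact segment
segment_b = mean_within_segment_contact(contact, [3, 4, 5])  # more typical segment
print(segment_a, segment_b)
```

Computing this value for every segment in each of the two Hi-C datasets would give one point per segment on a plot like the one described.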
And, compared to every other 30-megabase region in the genome, 16p has the second highest degree of contact across those two datasets on average across the genome. The region to the right, this little dot -- oh, you can't see my pointer, I'm sorry. The region to the right of it has higher, and that happens to be our friend, 22q, but I'll get to that in a second. We didn't know that. We just, same with the rest of this project, looked at that dot and was like what's that, and then it just happened to be a 16p and 22q thing, which is kind of how this has gone.
So anyway, when we look at -- if you zoom in on 16p, Dan checked to see if we could explain variation in these extended cis effects based on contact in the region and indeed, this pseudo-telomeric area where the effects are concentrated in both the rare and common variant data is the area that has the highest contact with the deletion region. So the greatest degree of contact across the p-arm is between the deletion region, which is pseudo-centromeric, and this 0 to 6 megabases, which is pseudo-telomeric, and that is where we see the concentrated effects, which made us wonder if we'd stumbled into just something more generally interesting that you could see across CNVs, but obviously we are particularly interested in 22q, given that -- it's this little orange guy right here.
So it's actually the only place in the genome that has on average a greater degree of within region chromatin contact than 16p, and you know, if I were to make a bet, it seems like it's maybe not random that the two classic neuropsychiatric CNVs that have after many years of work never been resolved to a single gene or subset of genes are apparently operated under a noncanonical mechanism. Perhaps it has to do with this sort of noncanonical oddity going on. But the purpose of the future investigation that I'm going to talk about is to indeed figure that out.
So Ajay Nadig, the aforementioned also amazing MD/PhD student working with me and Luke has started digging into this, with our colleagues Ralda Nehme and Matt Tegtmeyer in her lab. In brief, we see the same thing with 22q, where in particularly neuronal progenitor cells but also neurons and stem cells from induced 22q deletion lines, you see these extended cis effects where the 3 megabase deletion, which in this case is now near the telomeric end of the q-arm has extended expression effects that are actually quite evenly distributed all across the chromosome, the q-arm of chromosome 22.
And in a lot of ways, 22q has been easier to look at, not just because we're replicating analyses that we did before with 16p, but because there's more data and all of the relevant genetic signals are stronger. So in 22q, we were able to very clearly see an association between the per gene expression effect on the q-arm of the 22q deletion and that gene's degree of contact with the deletion itself.
So while we were able to look at this regionally in 16p, we could see it at the per gene level in this case, and that's the plot on the right. So as each gene has more contact with the 22q deletion in 3D space, the expression effect of the deletion on that gene increases, and we validated several of the kind of more interesting gene expression changes with qPCR, including SMARCB1.
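The per gene relationship can be summarized with a simple correlation and slope. This sketch assumes each gene has one contact value with the deletion region and one expression effect (a log2 fold change), and it relates contact to the magnitude of the effect; the numbers are made up and the published analysis may have been modeled differently.

```python
from math import sqrt
from statistics import mean

def slope_and_r(x, y):
    """Least-squares slope and Pearson correlation of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical genes on the q-arm: contact with the deletion, and log2 fold change.
contact_with_deletion = [1, 2, 3, 4, 5]
log2_fold_change = [-0.10, -0.20, -0.25, -0.40, -0.50]

effect_size = [abs(v) for v in log2_fold_change]  # magnitude of the expression effect
slope, r = slope_and_r(contact_with_deletion, effect_size)
print(round(slope, 3), round(r, 3))
```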
And indeed, again, the PRS, which this time we looked at not just with autism but with schizophrenia and bipolar disorder and lower IQ, all are highly correlated with the expression effects of the deletion.
So next steps, how are we going to start to puzzle through what's going on?
So there's something kind of intuitive about taking a chunk out of a very high contact region of the genome and seeing expression changes that are highly distributed. It's markedly more confusing what could be going on with the polygenic risk score. So we initially looked at this in the form of just statistical analysis, because you can imagine a model where -- I guess the goal of this analysis was to distinguish between two potential things.
So the first option was I guess within 22q you can imagine a model where in general, just reduced expression on average of genes across 22q increases your risk for psychopathology, and then all the PRS is doing in having this average negative association with expression is picking up on that, that risk comes from on average reduced expression of those genes. The other model is that like the CNV, little bits of common variation are having uncommonly distributed effects across this high contact region. So we constructed an analysis where you break the polygenic score, in this case I think it's for -- it's either schizophrenia or the aggregate we made of multiple psychiatric diseases, because they all looked so similar here.
We broke it into 8 megabase chunks across the genome, because that was the minimum we could go to while still being statistically powered for the analysis, and if you take each 8 megabase chunk and you look at the gene expression effect, the absolute value thereof within the chunk, you get an average of this gray dot. So as we would expect, when you take common variation that is associated with psychiatric disease, it induces some expression variability on genes within each 8 megabase region. And it's a little higher at baseline in 22q.
But the more important thing is once you get outside of the 8 megabase region, that expression effect just plummets to below the noise level, really quickly. So typically as soon as you start looking at the expression effects of an 8 megabase chunk on genes outside the 8 megabase region, you don't see an average effect that exceeds statistical noise, but with 22q, it just slopes on and up, over the course of this large high contact region.
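One way to picture this comparison is to average the absolute expression effect of a chunk on genes grouped by how far they sit from it, and then ask which distance bins exceed a noise floor. The sketch below does that with invented numbers; the bin convention (0 means inside the chunk), the noise level, and the function names are assumptions for illustration only.

```python
from collections import defaultdict
from math import ceil
from statistics import mean

def mean_effect_by_distance(records, bin_mb=8):
    """Average |effect| per distance bin; bin 0 holds genes inside the chunk."""
    bins = defaultdict(list)
    for distance_mb, abs_effect in records:
        bins[ceil(distance_mb / bin_mb)].append(abs_effect)
    return {b: mean(v) for b, v in sorted(bins.items())}

def bins_above_noise(records, noise, bin_mb=8):
    """Distance bins whose average |effect| exceeds the assumed noise floor."""
    by_bin = mean_effect_by_distance(records, bin_mb)
    return [b for b, v in by_bin.items() if v > noise]

# (distance from chunk in Mb, |expression effect|); all values are hypothetical.
typical_chunk = [(0, 0.30), (0, 0.34), (3, 0.05), (7, 0.03), (12, 0.04), (15, 0.02)]
chr22q_chunk = [(0, 0.40), (0, 0.44), (3, 0.30), (7, 0.34), (12, 0.28), (15, 0.30)]

print(bins_above_noise(typical_chunk, noise=0.05))
print(bins_above_noise(chr22q_chunk, noise=0.05))
```

In the toy data, the typical chunk only clears the noise floor inside its own region, while the 22q-like chunk stays above it at every distance.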
So the question is what is it doing? Now we're hoping to generate new Hi-C data from something like, for example, the BICAN project, which has a lot of human variability and PRS attached, so that we can figure out: is the PRS, like the deletion, probably changing the contact structure, which results in this? Or is there something more exotic that we don't understand going on?
And this is another ongoing next steps project being led by Ajay and Luke O'Connor.
Several of these things were supported by a new grant from SFARI, a collaborative track grant to me and Luke and Kaitlin Samocha and Mike Talkowski to better understand 16p. So one of the things we initially noticed when looking at these effects, again this isn't particularly surprising, but opened up a bunch of new fun ideas. As you might guess, when you induce expression variation across an uncommonly large swath of the genome, in this case like vaguely 30 megabases, all those genes have friends and expression partners. So you induce this massive amount of transcriptome-wide dysregulation.
So you basically have a highly over-dispersed gene expression distribution because you're impacting lots of genes, and then those genes impact lots of genes. So there's just this huge genome-wide wiggle that Ajay and Luke gave a much more adult name of transcriptome-wide impact, and they are now in the process of finalizing a statistical approach to quantify it across a variety of traits.
The model is now called TRADE, and it provides a clear statistical estimate of transcriptome-wide impact in the variant sense that allows people to compare between any kind of well-controlled perturbational condition, and I think it's going to be widely used in this and a lot of other senses.
And they can already see very fun things that they were kind enough to let me share here. I'm sharing some of the a little less surprising but good to know things, and one is simply that constrained genes introduce more transcriptional variability. They have greater transcriptome-wide impact. These genes you see, this is just a rank of all genes, or the genes they looked at, from the Perturb-seq dataset.
The highest ranked genes are known kind of regulators of many other things. GATA1, for example, is a master regulator of erythropoiesis. So anyway, the more constrained genes, more transcriptome-wide impact.
But something I really enjoyed was that LOF-constrained genes, while they cause more, they're genetically, they're buffered against it. So highly constrained genes are the least perturbed on average by other genetic events, suggesting a productive and kind of important relationship between loss of function and expression constraint. This will be presented at ASHG and I think a preprint is forthcoming.
We're also going to revisit constraint. So this work is being, as her part of the SFARI grant, Kaitlin is going to take a new look at something that became a point of great curiosity for me during Dan's original analysis. So while the p-arm of chromosome 16 has an enormous number of genes, it is very gene-dense, and it is brain gene-dense specifically, if you control for how brainy it is, it has incredibly low numbers of constrained genes. Like controlling for how many you would expect based on gene density and specifically brain gene density, it is very, very low in terms of the number loss of function intolerant genes, and that kind of excited my autism genetics brain, because the 16p11.2 deletion is one of the only genetic events that remains significantly associated to autism when you remove the cases that also meet criteria for intellectual disability.
So it is among the autism-associated genetic events one of the least IQ impactful for its effect size on autism, which is obviously not to say it's not IQ impactful. It's just less than most of the PTVs that cease to be associated with autism if you remove the cases with intellectual disability.
And it has a concentrated relevance in terms of the polygenic signal and the polygenic, as I noted, is positively associated with intelligence, which made me wonder could this lack of constrained genes be in some way related to its neuropsychiatric impact that seems to be able to exist with less than average cognitive impact?
But then Kaitlin productively yucked my yum and said that she thought it was just because the constraint was estimated wrong. So Kaitlin Samocha, who recently started a group at MGH and Broad is probably familiar to many, because she's responsible for things like pLI and LOEUF and missense constraint and now is continuing to work on those things within the context of her own lab. I'm sure there will be many more fun acronyms emerging from this work.
So she develops mutational models, which predict how much of a certain type of variation you should see and then genes -- this is in the broad sense what mutational models do -- genes that have less of that predicted amount of variation are often said to be constrained. That's the general idea.
And one of the ways you check mutational models is to make sure they're predicting the number of synonymous variants correctly.
The telomeric region, where we see these concentrated expression effects in 16p, performs horribly when it comes to the mutational models in that the synonymous rate is off, and we don't know why that is. So Kaitlin, as part of this grant and her new efforts, is going to be taking new statistical approaches to try to better model this region, to understand what it is, perhaps more generally, about these high contact regions that is making the model fail, and to figure out if she is right and these genes are indeed more constrained than we think.
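[Editor's illustration] The synonymous check described above can be sketched as an observed-to-expected comparison. This is a simplified sketch under stated assumptions, not the published pLI or LOEUF method: the gene names, counts, and the 20 percent tolerance are hypothetical, chosen only to show the idea that a well-calibrated mutational model should predict roughly as many synonymous variants as are observed.

```python
def obs_exp_ratio(observed: int, expected: float) -> float:
    """Ratio of observed variant count to the count the model predicted."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def synonymous_calibration(genes, tolerance=0.2):
    """Return genes whose synonymous obs/exp ratio is off by more than `tolerance`.

    genes: dict mapping gene name to {"syn_obs": int, "syn_exp": float}
    tolerance: hypothetical cutoff on |ratio - 1|, not a published threshold
    """
    flagged = []
    for name, counts in genes.items():
        ratio = obs_exp_ratio(counts["syn_obs"], counts["syn_exp"])
        if abs(ratio - 1.0) > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical counts: GENE_B stands in for a region where the model fails.
genes = {
    "GENE_A": {"syn_obs": 100, "syn_exp": 98.0},
    "GENE_B": {"syn_obs": 60, "syn_exp": 100.0},
}
poorly_modeled = synonymous_calibration(genes)
```

Where the synonymous ratio is far from 1, a low loss-of-function ratio for the same gene is hard to interpret, which is one way to read the concern raised about the 16p telomeric region.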
I have to mention NeuroDev, because Project 4 actually funds part of NeuroDev, specifically the analysis of NeuroDev South Africa, which is led by Kristy, and thanks to Kristy and her team's work, NeuroDev South Africa has completed collection as of June and we are now in the analysis phase.
NeuroDev Kenya, which is funded by NICHD and NHGRI, is going to continue to collect for two more years. They weren't able to pivot to remote during COVID. So that project is being extended.
NeuroDev is just thriving, and I'd be happy to talk about it with anybody at any point today.
Oh, I remember why I made this slide, other than giving you an update about NeuroDev. It was because Kaitlin is also going to, as part of her early lab efforts, develop constraint models for diverse ancestry populations with the new NeuroDev data; as we discussed on a recent G2MH call, those don't seem to be working very well.
Lastly, 3D genome characterization. Mike on the 16p side and Ralda on the 22q side, potentially in collaboration with Raquel and Donna and this crew, are hoping to develop a much better understanding of the impact of 16p and 22q CNVs on the organization of the 3D genome to try to unpack exactly what's happening here, and the kind of transcriptome-wide effects of the mutations. Mike has this suite of 16p isogenic lines that I talked about before, and our grant from SFARI is going to permit the 3D genome kind of change and baseline characterization of those lines.
Ralda has already generated Hi-C data from her 22q iPSCs, and we're hoping to do so in much greater number across those two as well as other CNVs and potentially integrate data from human carriers as well as induced mutation data.
That is it. Thank you so much to all of you, and particularly Dan and Ajay and the million people who got excited about this as Dan and I did.
DONNA MCDONALD-MCGINN: So this session is open for questions.
JONATHAN SEBAT: Since Elise, I had some questions related to her talk, so about the duplication. I think most of the data you presented was the deletion, all the deletions. So what happens when you look at the duplication?
ELISE ROBINSON: We haven't. I think the n for the 16p dup in Mike's lab was like 2, and it's much lower in Ralda's as well, but as part of the SFARI grant, we will be looking at the dup.
JENNIFER MULLE: I had a similar question to Jonathan. I imagine -- I had two questions, actually, and one is I imagine do you think given that the duplication -- I mean that the deletion decreased expression, would you predict the duplication would increase expression and would you consider, wouldn't that be satisfying?
ELISE ROBINSON: I don't know. I think we thought about it in a thinking about nothing else sense. But Ralda has pointed out that you can imagine a lot of different ways to break the contact mapping, and from duplications -- I just don't know if we --
JENNIFER MULLE: Yeah. My real question is, so there are single genes out there like CHD8 or FMR1 which seem attractive to study, because it's just a single mutation, but then you think about the fact that CHD8 sort of regulates so many other genes, right? Like you're really looking at a ton of targets, and that's why CNVs seem attractive, because there are a limited number of targets, except your data suggests that maybe not, right? Maybe even when we're looking in a 16p deletion or a 22q deletion, that really the number of genes that are involved are greater than what's in the interval.
Do you think we're just sort of locked into polygenic or oligogenic models no matter what? Is that where this is headed?
ELISE ROBINSON: Sure.
If the question is is there a way to escape complexity, I think the answer is no. I do think, I don't know, I feel like we spend half of our time right now at work talking about where to put our money. But I really like the places where rare and common variations merge, and it could just be completely random that we see this magical convergence of rare and common variation at these two spots. Chromatin contact is exceptionally high in these two spots and they happen to be our spots. But it's not very probable.
CHRISTA LESE MARTIN: Along those lines, has anybody looked at the underlying sequence homology? Because there's a lot of shared sequence homology between like the telomeric regions and the centromeric regions that could be helping to align some of that.
ELISE ROBINSON: That is part of what Mike and Kaitlin are doing as part of the work.
SEBASTIEN JACQUEMONT: I have a comment on what Jen just said and a question for you, Elise. So in my view, you can't have large effect size without polygenic effects. The graph where you show high constraint equals large disruption or then if you look basically the enrichment of all the de novo variants, they're all in chromatin remodeling or they're all epigenetic genes. So there's no escaping complexity.
So I have a question for you, Elise. So now you're like revisiting, if I understand properly and I try to reformulate what you're doing, you're sort of revisiting an eQTL paradigm, but now you're doing like blocks -- so you're doing an aggregate of common variation looking at the effects on aggregate levels of expression when you're looking these blocks in the genome, looking at changes in the expression, within the block and also contiguous to the block.
So it's almost like an aggregate association between common variation and aggregate levels of expression. So how does that -- do you think that relates to underlying eQTLs in there? Or is this like a completely different mechanism which is driven by chromatin structure and contacts? Are these going to be two independent ways of explaining variation in transcription variation? What is your -- I don't know if my question is clear.
ELISE ROBINSON: So are we interested in kind of the within chunk or outside?
SEBASTIEN JACQUEMONT: Or can you capture like the variation that you're seeing, which is average changes in expression within regions related to chunks of PRS; is that in part captured by what we already know about single eQTLs moving single genes, or is it just completely another mechanism?
ELISE ROBINSON: Yeah. For sure. If you look within a section, it's basically just statistically equivalent to a TWAS, but within one person instead of between two datasets, which I think could actually be helpful. But that should be all the stuff we normally look at.
I am not -- I guess we are developing a new analysis approach, but not really for the sake of doing so. I think we're just doing what has to be done to try to understand the mechanism of whatever we're observing here. And I want to understand how polygenicity can possibly mimic the rare variant effects in these regions, which I guess is highly contact dependent.
ARMIN RAZNAHAN: I have a question for Elise and one for Christa, maybe starting with you, Elise. Do you think the asymmetry in how much it modulates expression transcriptome-wide versus how much it is modulated for a particular gene, are you interested in that as a potential boosted kind of constraint metric and how do you think, if you had to guess, it would pit against pLI?
ELISE ROBINSON: Interesting. I don't know. I hadn't thought -- if you e-mail Ajay, I am sure he'd be really happy to hear from you. But that's interesting, because is it not just kind of reflective of something you can incorporate into a gene's constraint estimates?
I think it might require some kind of clarification of how this is exactly happening. But it's really interesting.
ARMIN RAZNAHAN: I'll drop an email, thanks. And then for Christa, I was wondering whether you think there could be unique predictive value from the -- I'm thinking about your shift happens paradigm. Have you looked into whether both how much someone is shifted and where they end up have unique predictive capacity for outcome? You could imagine it could be problematic having an IQ of 60 and that might have a relevance to your psychopathology load. But it might be a different consequence whether you got there from 100 versus you got there from 120.
CHRISTA LESE MARTIN: I don't think we looked at the data in that way. We didn't look at it to see if those were separable or not.
DONNA MCDONALD-MCGINN: Christa, I had a question about your cohort. So you have that very large group of patients with all kinds of things and then the CNVs that we're talking about today. Are there particular pockets of conditions within that, or are they mostly one-offs, unique to individual patients or families? What's that?
CHRISTA LESE MARTIN: It's a mishmash, I would say. We have been monitoring that to see are there ones that the sample sizes get large enough that we could put them into sort of more our targeted areas and go after them, but largely, because they're rare genetic diseases, a lot of them are smaller numbers. But there are a few that are starting to rise up to our more targeted CNVs and genes.
DONNA MCDONALD-MCGINN: And are most of them familial or de novo?
CHRISTA LESE MARTIN: Honestly, I think it's probably about half. We have the ones that are more severe presentations that tend to be de novo, and then 1q21 deletions for example, which tend to be more inherited. So I think it's sort of a mixture as well.
JENNIFER MULLE: I want to leapfrog off of Armin's question or just point something out. So you find, for example, with 16p that there's an average of two standard deviations, right? Of impact. And if you were to look at people or families on an individual level, I think it would be exciting to look at the outliers. So people who have the duplication or deletion and are substantially more than two standard deviations away from their parents or substantially less, and I wonder if that group would yield particular insights about like second hits or polygenic risk score or environmental influences that, you know, for risk or resilience.
We would love to do that in our population, but we just aren't quite big enough yet. But we have some people that are really profoundly quite affected or less affected than we think they should be, and I think that could be really fruitful to look at outliers.
CHRISTA LESE MARTIN: I think Cora saw that a little bit in 16. We had some outliers, like with SRS, that were way higher than we expected them to be. So yeah, we had some, in particular with the parents, the parents or siblings, we had some pretty remarkable outliers on the SRS. So we definitely see some evidence of that, and it would be interesting to delve in what could be contributing to that which is so outside of what one might expect.
SEBASTIEN JACQUEMONT: So, just to comment on that. If you take a trait that has a heritability of, for example, .8, in a few percent, so I think if you look at the distribution of the trait in the children around the parental mean, it varies normally with random segregation, and if you look at 100 families, for example, you will have a few families where just by random segregation you'll have a measure of the trait that will deviate over two standard deviations of what you measure in the general population, just by chance.
So the standard deviation of the variation of the trait in children around the parental mean is 70 percent of the population variation of that trait. So it's very close.
The standard deviation in children is actually quite close to the general population standard deviation, even though the heritability -- this is for heritability like height or even if you agree that IQ has a heritability of .7 or .6 or .8. So you have quite a bit of variation. So it works, yeah, in 60 percent of the cases, the trait is going to be around plus or minus one standard deviation around the parental mean.
But by chance, if you're recruiting 100 families, you're going to get -- so it's not only just new de novo mutations. So you will get probably new de novo mutations and things like that, but it's also -- so in the clinic also, it happens all the time, because your people are -- kids are referred and a lot of the kids that are referred are the ones that fall into that, it's not even extreme, every 50 or 100 families are going to get that really extreme deviation.
I mean, siblings look like each other, but not really in fact, if you really look at it. Do you look like your siblings? So I think that's, it's -- I think that's why we would love to replace that by all common variants to capture that. The effect that we're capturing now with parental mean. So just a reminder.
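[Editor's illustration] The arithmetic behind the point about random segregation can be sketched under a textbook additive model. The assumptions here are the editor's, not the speaker's: random mating, no shared environment, normally distributed residuals, and the standard result that regressing a child on the parental mean leaves a residual variance of 1 - (h^2)^2 / 2 in population-variance units. With a heritability of 0.8, that gives about 68 percent of the population variance, which is one possible reading of the "70 percent" figure.

```python
import math


def residual_variance_fraction(h2: float) -> float:
    """Share of population variance left in children after accounting for
    the parental mean, under a simple additive model (an assumption)."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability must be between 0 and 1")
    return 1.0 - (h2 ** 2) / 2.0


def residual_sd_fraction(h2: float) -> float:
    """Same quantity expressed as a standard deviation."""
    return math.sqrt(residual_variance_fraction(h2))


def prob_beyond(threshold_sd: float, h2: float) -> float:
    """Two-sided probability that a child lands more than `threshold_sd`
    population SDs from the value predicted from the parents."""
    z = threshold_sd / residual_sd_fraction(h2)
    return math.erfc(z / math.sqrt(2.0))


var_share = residual_variance_fraction(0.8)   # share of population variance
sd_share = residual_sd_fraction(0.8)          # share of population SD
p_two_sd = prob_beyond(2.0, 0.8)              # chance of a 2 SD deviation
expected_in_100 = 100 * p_two_sd              # expected such families per 100
```

Under these assumptions the residual standard deviation is a bit over 80 percent of the population standard deviation, so a handful of large deviations in every hundred families is expected from segregation alone, without invoking de novo mutations or second hits.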
DONNA MCDONALD-MCGINN: So for the phenotypers in the room, there's a question from Lila Brebec(ph.), and it says: I'm wondering on the impact of the gold standard in the diagnosis profile of ASD and the impact on DSM changes for profiling the diagnosis in lieu of what has been connected in 16q duplication.
JONATHAN SEBAT: Is that question asking how does autism diagnostic criteria affect this study? Is that the main question? They're basically saying how does DSM or other autism diagnostic criteria affect us? I don't understand the question very well.
PARTICIPANT: Let me ask it more broadly. I'm not the question asker, but I think that some in the audience are listening to this and wondering about whether the diagnostic categories in the DSM and how we diagnose disorders will change as a result of the genetics first approach. So for example, your slides showing the -- I think it was Christa's slide with the family history and so many different disorders being related to common genetic variants, how will that affect how we classify mental health disorders in the future? Will we be switching to a diagnosis based on the genes, or will we still be diagnosing basic mostly on clinical symptoms?
JACOB VORSTMAN: Maybe I can say something about this, because one way we actually take this into account is that we don't only rely on categorical diagnosis, but we look at symptom domains and Raquel Gur's presentation that was presented. So for example, specifically for this question, we're not only saying, asking the question does the child, does the person meet the criteria for formal diagnosis of autism, we also ask the question does the person have symptoms of social and communicative impairment or repetitive behaviors.
Same for Christa's talk, where you could see whether a person may have a lower IQ, lower than expected, given parental IQ, but still not meet criteria for the categorical diagnosis of intellectual disability. So we're actually better off looking at those sort of changes in severity or dimensionally rather than relying on the diagnosis category.
DONNA MCDONALD-MCGINN: Thank you, Jacob. There is also an interesting question from Grace Stravinsky (ph.) asking if there's an association between schizophrenia and thyroid disease.
PARTICIPANT: Yes. There is. There is a well-replicated in the general population association between hypothyroidism and schizophrenia on epidemiological grounds and within 22q11.2 deletion syndrome there is also such an association. Lifetime diagnosis of hypothyroidism increases risk by about twofold, and we don't understand why. But that's a known association.
PARTICIPANT: I have a question for Elise about the results. So one thing that's interesting about 16p is we know it has that increased risk for autism because it's a recurrent -- it's one of the few, like the recurrence you see in these are really spotlights where we can see disease risk, because you find them in many people, unlike more random atypical CNVs which are sort of one-offs. But what I find interesting about your results is if I were going on a walk at night and I dropped my keys, the first place that I would check would be under the spotlights, because that's where I can see. But there could be plenty of reasons, other places in the dark, where I probably wouldn't be able to find them.
But when you did your statistical analysis of PGC and -- was it SPARK -- it hit one of the places where we can actually see and have inference, 16p, like one of the few recurrent regions. Like if that would have happened, if you would have looked at the convergence and it wouldn't have been a recurrent CNV but you found the same features, would you have looked at it in the same way? And sort of gone after it?
I just, isn't it kind of puzzling that your result happened in a place where we would actually know something about that region? But there could be tons of regions like that that have this same sort of effect.
ELISE ROBINSON: Yes. The 16p thing was just weird. That was very strange. But I guess I no longer have access to my slides, but in the PRS space, it's not like 16p and 22q are like out there and then everything else is that gray dot. There's a huge amount of variability in the extent to which you can see kind of extended expression effects of little chunks of common variation.
And in the analysis we did for the 16p paper that are probably stuck in the supplement somewhere, there is a distribution of average effects on expression in response to large sections of PRS. The struggle is the current sample sizes were not powered to understand that distribution. So we were powered basically to do, in the 16p case, one test and that was is the effect on expression on average across 16p in response to the autism polygenic score nonzero? And the answer was yes.
We were able to categorically say that it also happened to have the largest effect on average expression compared to every other region in a rank order way, but you can't statistically compare it even to the second one, and the reality is while your question is of almost primary interest to me, our ability to ask it is going to be slow coming. So the aforementioned BICAN project, which is being led by our friends Evan Macosko and Steve McCarroll, is going to add another 200 postmortem brain samples to the already 120 that we were working with in Steve's original dataset.
But even that plus things like CommonMind, you're still well under 1,000, and we as a community just need to build massively bigger resources to start to understand these patterns. It's not unlike 15 years ago where we were stuck looking at single genes and you ended up with the candidate gene era because we lacked the ability to consider things genome-wide. That's kind of where we are in the expression space, simply because of resource limitations.
DONNA MCDONALD-MCGINN: Grace is asking if we can point her to the publications that talk about thyroid disease and schizophrenia, so maybe we can put them in the chat in a bit.
SEBASTIEN JACQUEMONT: I also have one question regarding the slide that you showed on Dan's paper where you do a concordance analysis between the genetic correlation of common variation and the genetic correlation across the gene burden test. So you use the Genebass summary stats to do this I think, and the question I had, it may be a little bit technical, but how many -- it's unclear how many genes are significant for each trait and when you do your genetic correlation, you have an estimate for every gene, for the gene burden, and do you use all of them? Do you shrink a little bit for the ones that are really not significant at all? How much noise is -- because you show that there's on average almost a twofold increase in genetic correlation when you compare the rare to common. So is that related to any differences in statistical methods between the common and the rare variants, or what is your take on this?
PARTICIPANT: Now that you've asked, I am also curious, and I'm going to take a peek and get back to you.
JONATHAN SEBAT: We did start the talks about 10 minutes early. I think Geetha and Josh's presentations ended early. So if there aren't further questions for this session, I think we can go ahead and complete the Q&A and then start our coffee break and then we'll come back for the affiliate presentations.
What time should we come back for the presentations? Oh, sorry. Should we go ahead and do it? Because there was supposed to be a coffee break and then David.
PARTICIPANT: Let's take a break a little early and then we swapped the order of talks. So David will give his talk after the break. Thank you, everyone, for your wonderful presentations so far today. Looking forward to having more discussions over break, and we'll see you in a bit.
We should try to actually keep to the schedule rather than starting early.
Large-Scale Evaluation of the Effect of Rare Genetic Variants on Psychiatric Symptoms and Cognitive Ability
JONATHAN SEBAT: We are back. I am delighted to be able to introduce David Glahn, chief of research, director of psychiatry, at Boston Children's Hospital. We've heard from Projects 2, 3, and 4, but we haven't heard from Project 1. David is the contact PI for Project 1. Take it away, David.
DAVID GLAHN: Okay, thank you, guys. I apologize for being a little bit late this morning. Today I'm going to be talking about Project 1. I don't have any conflicts of interest.
As everybody this morning has been talking about copy number variants and variations within, our project is a little bit different, because we don't focus on any particular copy number variant or on any particular illness. Rather, what we're trying to do is understand how copy number variants work in very large populations, some of which are recruited for disease, and some of which are not.
Specifically, when we sat down to think about what kind of project we could do, we were fascinated by this problem that many of the CNVs that we see in clinic aren't ones of the type that tend to be studied in groups. So we wanted to understand and think about those CNVs. Was there any way that we could think about how to model them?
In addition, there are always questions about whether or not the full phenotypic spectrum is known for some of these CNVs, and questions about why it seems like so many CNVs overlap with so many different psychiatric illnesses.
From that we created a project which we subsequently renamed CAMP, or CNVs and Major Psychopathology. In this project, we have four specific aims. One is to acquire a very large cohort of archival data and then to harmonize those phenotypes. The second is then to characterize a set of what I'm going to refer to as recurrent CNVs in large general populations.
The third was then to examine the contribution of common variants via polygenic risk scores and CNVs in humans. And then the final is to model the effect sizes of rare CNVs on dimensional phenotypes.
What I'm going to do this morning is to talk a little bit about our progress in each of these four aims. I'll note that we've been fairly productive to date. We have 22 papers published thus far from this project, and another five or six that are kind of en route.
Thinking about the first aim, the cohorts and harmonization. When we started with this project, we identified a certain set of cohorts, got us to about 700,000 people. We received a supplement and were able to search for more cohorts. In total, we identified 59 cohorts with about 2 million people in them. We had to exclude 28 of those, mostly because they didn't have adequate genetic data, in general. And/or they were just too difficult for people from the United States to be able to get their hands on.
From that we identified 31 cohorts, and that puts us at about a million and a half people. There's one that's missing in this list, but here's a list of the 30 current cohorts we've got, their sample sizes, and whether or not they have cognitive data, symptom data, or diagnostic data.
What you see is that it's a very large group and some of these groups have significant amounts of data associated with them and some of them have less. And there are all kinds of issues about exactly how data were acquired, what the requirements were for selecting them.
We've done lots of charts like this, which is giving you a set of these different cohorts we have, and asking questions about the age of those cohorts, or their racial structure, and exactly how they were ascertained. And we've also done a good deal of work trying to do phenotypic harmonization.
Phenotypic harmonization turns out to be exceptionally difficult or very easy, depending on how you look at it. Some days I'm really happy and it's very easy, and other days I'm pretty depressed.
Here's some work that Emma Knowles at Boston Children's was able to do, and this is actually harmonization within the same datasets, looking at UK Biobank data. What we did was try to estimate intelligence for the largest group possible, so a general g factor. But if you do that using strict criteria and one timepoint, you get about 160,000 individuals that you can have g on. If, however, you assume that there is temporal invariance, which is kind of one of the basic assumptions in creating a g, then you can pull measurements from different timepoints and you can up your numbers to 257,000.
So we've been doing stuff like that, lots of psychometrics, and questions that come along. But in doing so, we've been able to aggregate across datasets, and within datasets, as this is an example.
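[Editor's illustration] The gain in sample size from pooling timepoints can be sketched as a simple eligibility count. This is a toy sketch, not the project's psychometric procedure: the record layout, test names, and participants are hypothetical, and it only shows why assuming temporal invariance lets more people contribute a complete set of measures.

```python
def strict_complete(records, tests, timepoint):
    """Participants with every required test at one fixed timepoint."""
    return {
        pid
        for pid, visits in records.items()
        if all(test in visits.get(timepoint, {}) for test in tests)
    }


def pooled_complete(records, tests):
    """Participants with every required test at any timepoint, which is
    only defensible if scores are assumed comparable across time."""
    eligible = set()
    for pid, visits in records.items():
        seen = set()
        for scores in visits.values():
            seen.update(scores)
        if all(test in seen for test in tests):
            eligible.add(pid)
    return eligible


# Hypothetical records: participant -> timepoint -> test -> score.
records = {
    "p1": {0: {"reasoning": 7, "memory": 5}},
    "p2": {0: {"reasoning": 6}, 1: {"memory": 4}},
    "p3": {1: {"reasoning": 8}},
}
tests = ["reasoning", "memory"]
strict = strict_complete(records, tests, timepoint=0)
pooled = pooled_complete(records, tests)
```

Here the strict rule keeps one participant and the pooled rule keeps two, mirroring on a tiny scale the move from roughly 160,000 to 257,000 individuals described above.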
In addition, Sebastien and his team of merry men and women in Montreal have been calling CNVs on very large sets of individuals, and so at this point we've called CNVs on many of these cohorts. Not necessarily all of them, because sometimes it's still kind of in data acquisition. Those data are coming along as well.
Moving on to aim two, we'll be thinking about characterizing recurrent CNVs. When I'm talking about recurrent CNVs, I'm thinking about kind of the classic type of CNV that we can consider, like 22q11 or 16p, where you have low-copy repeats and you have a change in the DNA between those two low-copy repeats. So you're thinking about these types of CNVs that have what they used to refer to as genomic hotspots around them.
So you tend to see these recurrent, these CNVs happen more frequently. They happen more frequently in the population. So we're using the term recurrent here to kind of think about it after different papers that have come out. Others might take umbrage with that term, because they want to think about recurrent meaning recurring within a family versus being de novo within that family. We're thinking specifically about recurrent within a larger population.
We were able to recently write a manuscript looking at the contributions of copy number variants on psychiatric symptoms and cognitive ability, and the reason that I'm bringing this up here is because what we did specifically was try to go through the literature and see where we saw the most evidence for individual CNVs, and where those CNVs kind of overlapped between groups and where you had evidence across different locations.
Here, the reason that this is interesting is not because any of this is particularly novel. It's rather we spent a lot of time trying to figure out how to present it as a singular figure of where do you have it for neurodevelopmental disorders, autism, ADHD, schizophrenia, and so on?
Doing this also kind of helped us think about which group of CNVs we could really think about in this kind of theme. We then took some of those same CNV, that same list of CNVs, and went in and looked at depression symptomatology. Here we took data from the UK Biobank, we estimated a depression index based on items that were responded to in the psychiatric interview, we came up with a depression and anxiety scale, we showed that that highly correlated with the diagnostic data but was quantitative rather than qualitative.
We then applied that and looked for differences associated with CNVs. What we were able to show is that by and large, we got effects for CNVs across different measures, we found that in general we saw bigger effects for deletions than we did for duplications, and then we went out and modeled individual CNVs with this lower graph. We'll be revisiting this particular paper as we continue along.
We've been doing the identical work now with the IQ measure that I was just mentioning a moment ago, the g measure. And again, you kind of see similar sorts of effects. In this case, we're looking at no CNVs, a list of those recurrent CNVs that we're referring to as tolerant, meaning they don't have a gene that has a LOEUF score that would make you think it was intolerant, it was dosage sensitive. And then those that are intolerant.
And again, not surprisingly, you see a bigger effect for deletions than you do for duplications, and we are getting effects in this general population.
Here is a graph that shows everyone's favorite CNVs, kind of what we're seeing within this cohort. I will note that we are explaining a reasonably large amount of variance of g using this kind of approach. So we're fairly convinced it's working at least at this level of being able to say, yes, we can see population-level effects. I would also note that we replicated this particular set of findings in another cohort; however, for various technical reasons I didn't want to show that to you yet, but that's what this manuscript is meaning.
Then we've also been able to do other things, which is ask whether or not we can see effects of CNVs, of the prevalence of CNVs by ancestral group, and we've talked about this particular finding in this group of scientists before, and the answer is, yes, you can see these effects. But we've spent a lot of time modeling and trying to figure out exactly what the appropriate comparison group should be.
So you can see it when you compare it against the entire group of -- in this case, we're comparing individuals of African ancestry with white British or individuals of South Asian ancestry with white British, but then we also constrained it down to match the individuals from the outgroup to the white British group, which is by far the largest group in this sample, based on demographic data and on socioeconomic data and on ability to actually access healthcare.
Again, this is another finding that we've actually replicated in an independent sample, and that's moving us forward and has been the reason that we've kind of delayed publishing, because we wanted a second sample to find it in. What's striking about the second sample, which is this SPARK sample, is we get almost identical effects when you remove the autism probands.
Moving on to aim three. Now we're looking at CNVs and common variants. Here we're using polygenic risk scores. I understand that Dr. Robinson just discussed polygenic risk scores, I won't discuss how they're created. Rather, I will point out that there is some evidence out there that polygenic scores help us explain variable expressivity. That is quite useful, and particularly useful when you're looking at specific mutations and trying to understand that variable expressivity.
I will also note that we've been doing work as part of this project looking at the stability of polygenic risk scores. We published a paper last year showing that depending on which discovery sample you use, you have a lot of variation in how well you can and can't replicate your effect. In this particular figure, what we're showing is a polygenic risk score derived from the UK Biobank, randomly selecting subsets of individuals to create the discovery sample, and then testing that discovery sample on the rest of the sample. And asking how predictive is it, how predictive isn't it? If you have overlap between your discovery sample and your test sample, do you show effects, how does that look?
Here is a paper where we are doing the same thing, but now pointed much more toward psychiatry. What we're showing here are four different polygenic risk scores for major depression, schizophrenia, alcohol use disorders, and type 2 diabetes, where we've used two different discovery samples and then we asked what's the correlation for an individual of their risk in a third out-sample.
The third sample is never used in the discovery samples. And what we want to know is how much overlap is there? What you can see is the correlation for the major depression polygenic score is only .37. It's higher, .78, for schizophrenia, but we'll note that most of the people that were in the first discovery sample for schizophrenia were also in the second discovery sample for schizophrenia, so there are questions just about the stability of the score.
Why this becomes important is that if the polygenic score that you're generating is particularly dependent upon the discovery sample, you could tell someone that they are at high risk for an illness from one discovery sample, but not at high risk for an illness if you use a different discovery sample. That raises questions about the clinical utility of these types of scores. So we're very interested in this. This manuscript is currently under review. If anyone here happens to be a reviewer, please be kind.
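The stability question described here can be sketched in a few lines: score the same held-out individuals with two polygenic scores built from different discovery samples, then check both the correlation and whether the same people land in the high-risk group. The scores below are made up, and the top-20-percent cutoff is an arbitrary assumption for illustration, not a threshold taken from the manuscript.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def top_fraction(scores, fraction):
    """Indices of the individuals in the top `fraction` of scores."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


# Made-up scores for the same ten held-out individuals, one list per
# discovery sample (hypothetical values, for illustration only).
prs_a = [0.1, 1.2, -0.5, 0.8, 2.0, -1.1, 0.3, -0.2, 1.5, -0.9]
prs_b = [0.4, 0.2, -0.1, 1.9, 0.6, -0.8, 1.1, -1.0, 0.9, 0.0]

r = pearson(prs_a, prs_b)
high_a = top_fraction(prs_a, 0.2)
high_b = top_fraction(prs_b, 0.2)
# Share of people flagged high-risk by score A who are also flagged by score B.
agreement = len(high_a & high_b) / len(high_a)
print(round(r, 2), sorted(high_a), sorted(high_b), agreement)
```

In this toy data the two scores flag entirely different individuals as high risk, which is the practical concern raised above.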
Ignoring that, doing what the field tends to do, we then looked at that depression measure that I discussed before and the UK Biobank, and modeled the relationship between the depression measure and CNVs in aggregate and then CNVs individually, asking specifically do we see an interaction.
The answer that we've seen thus far is no. We didn't see an interaction at the aggregate level, so when we're looking at recurrent or intolerant or larger categories within, and then when we looked at individual CNVs, every now and again you see something, but the P value is relatively small. In the study that we've done, we can't say it's significant because we tested a lot of different models, and so by correcting for multiple comparisons, they kind of wash out.
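To make the "wash out" point concrete, here is a minimal sketch of a multiple-comparison correction. The talk does not say which correction was used; Bonferroni is assumed here purely for illustration, and the p-values are invented.

```python
def bonferroni(p_values, alpha=0.05):
    """Return per-test significance flags and the Bonferroni-adjusted threshold."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values], threshold


# Hypothetical p-values from ten CNV-by-PRS interaction models.
p_values = [0.012, 0.20, 0.047, 0.63, 0.009, 0.31, 0.08, 0.55, 0.021, 0.74]

nominal = [p < 0.05 for p in p_values]          # uncorrected
significant, threshold = bonferroni(p_values)    # corrected for ten tests

print(sum(nominal), sum(significant), threshold)
```

Several tests look interesting at the nominal 0.05 level, but none clears the stricter per-test threshold once the number of models is taken into account.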
We have a similar story when you're looking at interactions with IQ measures, where we're not able to show this. These again are the larger categories that we're looking at, we're looking at different PRSs. The phenotype is IQ, but we're looking at PRSs for schizophrenia or for autism or for IQ. I don't have the figure here for the individual genes. You get a little bit more interaction with this, but not much more. And again, it's the same sort of issue, which is that when you control for all the comparisons we're doing, it washes out.
At least at this very broad level, it's harder for us to show an interaction effect. That doesn't mean that you're not explaining variation in the presentation of the phenotype. It just means that it's additive, not an interactive effect.
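One way to write down the distinction between additive and interactive effects is a regression with a product term. This is a generic sketch, not the exact model from the study, which would presumably include covariates such as age, sex, and ancestry components:

```latex
y_i = \beta_0 + \beta_1\,\mathrm{CNV}_i + \beta_2\,\mathrm{PRS}_i
      + \beta_3\,(\mathrm{CNV}_i \times \mathrm{PRS}_i) + \varepsilon_i
```

Here $y_i$ is the phenotype (for example the depression measure or g), $\mathrm{CNV}_i$ is the CNV indicator or score, and $\mathrm{PRS}_i$ is the polygenic score. Finding no interaction corresponds to $\beta_3$ being indistinguishable from zero, while $\beta_1$ and $\beta_2$ can each still be nonzero, which is the additive case.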
Then aim four, we get to our nonrecurrent CNVs. Here we're thinking about not those CNVs that have relatively similar breakpoints, but CNVs that either are in the same locus and have very different breakpoints or are just randomly spread throughout the genome. And this was really one of the impetuses for our project, because we were thinking about ways of trying to model these CNVs.
And as a matter of fact, based on work that was done by Sebastien and his colleagues, we came up with a way of going into each CNV, each gene within each CNV, and then modeling and then estimating based on some index the likelihood that that gene is particularly vulnerable to having either one copy or three copies, whether it's a deletion or a duplication. So all we're doing is calculating in this case the LOEUF score for each of those genes, then averaging them together for deletions and duplications.
This is in some ways similar to just creating a genome-wide burden model for an individual, but it's based not only on the amount of the genome that has been disrupted, but also on how important we think the genes are in that region.
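The scoring idea described above can be sketched as follows: look up a constraint value for every gene inside each CNV, then aggregate separately for deletions and duplications. The gene names, constraint values, and CNV calls are all hypothetical, and the plain average is taken from the description in the talk; the published method may aggregate differently.

```python
from statistics import mean

# Hypothetical per-gene constraint values standing in for LOEUF
# (lower LOEUF means the gene is less tolerant of loss of function).
loeuf = {"GENE_A": 0.15, "GENE_B": 0.90, "GENE_C": 1.40, "GENE_D": 0.30}

# Hypothetical CNV calls for one individual.
cnvs = [
    {"type": "DEL", "genes": ["GENE_A", "GENE_B"]},
    {"type": "DUP", "genes": ["GENE_C"]},
    {"type": "DEL", "genes": ["GENE_D"]},
]


def cnv_scores(cnv_calls, constraint):
    """Average the constraint values of affected genes, split by CNV type."""
    per_type = {"DEL": [], "DUP": []}
    for call in cnv_calls:
        for gene in call["genes"]:
            if gene in constraint:  # skip genes without a constraint value
                per_type[call["type"]].append(constraint[gene])
    # None marks a CNV type with no scored genes for this individual.
    return {kind: (mean(vals) if vals else None) for kind, vals in per_type.items()}


individual_scores = cnv_scores(cnvs, loeuf)
print(individual_scores)
```

The resulting deletion and duplication scores can then be used as predictors of a phenotype, which is what distinguishes this from a burden measure based only on how much of the genome is disrupted.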
So we've used this technique in a couple of ways. Guillaume, who is somewhere in the audience, has utilized this in a relatively large sample of individuals to look for genes that seem to influence IQ. And this method has worked relatively well to be able to do that, to be able to go in and estimate the number of genes that seem to be important. If CNVs when you predicate(?) them on, in this case pLI index, whether or not you can show a direct relationship between that aggregate pLI score and intelligence. And indeed, we were able to show that, and that that effect seems to be greater for deletions than for duplications.
We've done similar types of work in other samples, and specifically here we're now looking at some work that Aaron Alexander-Bloch did, who was in my group but now is at Penn, where all good people seem to go. And he was able to estimate different -- he was able to call CNVs in the Philadelphia Neurodevelopmental Cohort, and to be able to show effects.
I particularly like this slide as kind of the conclusion to this talk. One, because it utilizes many of the same techniques that we have been developing in CAMP and that we did in close collaboration, but also because he gives us different estimates. So we're looking at the overall effect of CNVs, which you can see in these black circles, but then also gives effects of polygenic risk and of trauma and of environment. So what that allows you to do is think about the effect of a CNV, its aggregate effect, on the likelihood that an individual is going to have a psychiatric diagnosis, in this case a child.
To me, that's something, a direction, that we should be thinking about, which is, understanding the mutation that we're interested in, but also understanding that that mutation is in the context of the life of the child or the person, and the likelihood that that's going to be -- and how predictive that that's going to ultimately be, in whether or not they manifest the illnesses we're interested in.
From there, I would just like to thank all of my colleagues. I believe I'm about five minutes early, which means I rushed through this, and I appreciate everyone and hope that -- and I'm happy for questions.
JONATHAN SEBAT: There will be a Q&A after the next three talks, so we'll kind of bundle everybody together.
I'm pleased to introduce from Rutgers University Jennifer Mulle. She is one of the affiliate members of the G2MH, because of course it started with a few parent grants focused on specific CNVs, but of course we really need to understand the wider variety of rare variant disorders and how these different rare variants affect psychiatric traits, and so it's totally logical to start bringing other groups into the fold.
The 3q29 Deletion and Duplication: Adventures in Remote Phenotyping
JENNIFER MULLE: Thank you. Today, I am going to talk about the schizophrenia-associated 3q29 deletion and duplication and the efforts that we have been working on. But before we do so, I wanted to just -- I wanted to maybe take just a step back and speak to the logic that underlies this consortium. Often when I say that I work on a rare disease people immediately marginalize me and they say, well, this isn't relevant to -- you're working on something rare, that's not going to be important. I've actually gotten -- those exact words have been in some grant reviews, so that's not great.
But actually, a long history of research in human disorders really belies that assumption, and there's even a patient advocacy group in the UK that have proposed that rather than call these rare diseases, we call them fundamental diseases, to reflect how relevant and important they've been to understanding human physiology, and I'm just going to provide you with a few examples.
This work is like a century old, more than a century old, so it's easy to forget that so much of what we know about human metabolism is because of rare variants. A baby would show up at a pediatrician's office, that baby would have a dangerously high level of some metabolite, and through biochemical and later genetic means, it would become clear that that baby was missing the activity of an enzyme, and so then one could infer that that metabolite is the substrate for the enzyme, and that's a rate-limiting step in a given pathway. You probably can't even calculate how many metabolic pathways have been worked out through this, and medical students know this well because we haze them by making them memorize all of those pathways.
But here's an example where studying a rare variant has transformed our understanding. Even more relevant, we could talk about Brown and Goldstein, who were interested in cholesterol metabolism. They studied the extremely rare form of familial hypercholesterolemia, literally one in a million, which led to insights about cholesterol that were so transformative that not only did it lead to a Nobel Prize for Brown and Goldstein, but it led directly to statins, which my own doctor describes as a miracle of modern medicine. So here is studying something extremely rare that led to the development of a therapy that has profound impact in public health.
We could talk about cancer biology and Li-Fraumeni syndrome, which is caused by inherited mutations in p53, and we now know that p53 is somatically mutated in more than half of human cancers, and in fact a lot of the terminology that we use to describe cancer, like proto-oncogene, tumor suppressor gene, really arose from these early studies of cancer families.
And like William Bateson, again, more than a century ago, coined the famous phrase treasure your exceptions. The exceptions can often illuminate the rule. So when we think about fundamental diseases we haven't really been in a position until very recently where we have fundamental variants or rare variants that are associated with severe psychiatric illness. We've talked about this graph previously, and so I won't belabor it, only to point out that the 3q29 deletion is associated with the highest risk for schizophrenia of any known variant, so this is where we live, in the penthouse.
And just a brief history of the 3q29 deletion -- it was really first described, there's a case report in a single individual, and it was described as a pediatric syndrome. A few years later, there was a case series, six cases were described, again in pediatric patients. In 2008, there was a larger case series of the deletion, and the reciprocal duplication was also implicated as pathogenic.
In 2010, we joined the fray by showing that the 3q29 deletion, we were the first to show that it was associated with schizophrenia, and we were very relieved then when multiple independent groups also found evidence that the 3q29 deletion was associated with schizophrenia.
And then really in 2017 definitive evidence came from the PGC, the heroic efforts of Christian Marshall in particular, looking specifically for copy number variants associated with schizophrenia, and found that the 3q29 deletion was one of only eight variants with genome-wide significance.
This is the 3q29 deletion. On top we have the schematic of human chromosome 3. That highlighted area is where the deletion is. If you blow up that interval, there are 21 genes that are impacted by the deletion. It's 1.6 megabases, and about 95 percent of the study participants that we see have the same 1.6 megabase deletion, which is mediated by those low copy repeats.
So there's something about having one copy of one or more of these 21 genes that renders somebody uniquely susceptible for schizophrenia, and I think about these 21 genes when I wake up in the morning, when I go to sleep at night, I dream about them. So I think about this all the time.
I should point out that this is an extremely rare syndrome. It's present in about 1 in 30,000 in the general population, so it's an order of magnitude less frequent than, for example, the 22q deletion.
I've been invested in this for a long time. It took some time to form relationships with the community, to amass a critical sample size. In 2017, we received our first significant funding for deep phenotyping of our study subjects. We were able, through that work, to describe an overview of the syndrome and to recommend best practices for clinical care.
We've been able to describe the nuances of the cognitive profile, the shape of social disability in 3q29 deletion syndrome, we've learned that there are adaptive behavior deficits, which we've reported on. There are symptoms of pediatric feeding disorders, which is also common among other CNVs, and we think that's a very interesting piece of the puzzle. There are substantial -- there are visual-motor integration deficits.
In 2018, we got funding to expand our studies into neuroimaging, and we learned that there are really substantial deviations of the posterior fossa and cerebellum in particular, so substantial that you can actually see them -- I'm not a neuroimager, I'm lucky to have colleagues -- but when you look at a scan it becomes really apparent that the cerebellum is quite profoundly affected, and it turns out the degree of cerebellar volumetric reduction seems to be related to the amount of subthreshold psychosis symptoms that these individuals exhibit, suggesting that it may be relevant to mechanism.
We've even been able to describe caregivers' perspectives, how families feel when they get a diagnosis and what the stakeholder concerns are. We're also very interested in understanding mechanism of 3q29 deletion, and when the pandemic hit we used our time wisely by conducting in silico analyses to try to understand some of the impacts of the genes in the interval, and most recently, this paper came out less than a month ago in Science Advances, we've been able to identify cellular phenotype and specifically mitochondrial dysregulation that's associated with 3q29 deletion.
So we are deeply, extremely proud of this work, and it's a great start, but there's more work to do. There's always more work to do. This wonderful project, we feel grateful to have done it, but we evaluated -- oh, sure.
This is a true story. When I gave my dissertation defense a million years ago, my mom sat me down and she said, Jen, you tend to talk really quickly. What I'm going to do is I'm going to tug on my ear when you're talking too fast, and that'll be an indicator to slow down. And by the end of my dissertation talk, my mother was bleeding because she had tugged on her ear so hard. And Jenny just approached me and asked me to slow down because it turns out that the closed captioner can't catch up. So I'm going to try. I'm so excited -- that's why I'm talking quickly.
We're really proud of the work that we had done previously. But it's a relatively small sample size. With 32 individuals we can probably explain some general, the most common manifestations, but we haven't really documented the full range of phenotypic variance. There may be some things that are less common in 3q29 deletion syndrome that we probably haven't identified.
We want to understand heterogeneity. Some people with 3q29 deletion will go on to develop schizophrenia, but some won't. Why is that? And then outcomes in general. There are some people with 3q29 deletions who may live independently, who may have jobs, who may get married, and others who will need lifelong care, and understanding some of the factors that may predict that would be really useful in many ways.
The hard part of this work, when you study a rare disorder, is amassing a critical sample to do studies. We have a registry for 3q29 deletion syndrome that has been active for a very long time, and we've collected over 200 individuals. So we already have done some of the hard work of ascertaining individuals, and in addition to the deletion, we've also ascertained more than 50 people with duplication. So the question is how do we use all of this stuff that we've collected and expand our understanding of the phenotype, and how do we do this efficiently?
Of course, the pandemic hit, and we, like others, were thinking about telehealth and other remote ways of assessing phenotypes, and we wondered if we could use these strategies to do data collection at a distance. There are some things you can't do, there are many things you can do. You can certainly have people fill out questionnaires. You can do interviews via Zoom or other telehealth applications. And there even are some ways that you can do direct evaluation.
We reasoned that if we were able to transition to a remote phenotyping paradigm, we could also collect data on parents. We could extend our studies to the 3q29 duplication, which is exceptionally understudied, and our instrument could be harmonized with other studies, like G2MH participants, which would facilitate cross-disorder comparison.
So we piloted our remote phenotyping protocol. We felt really good about it. And in 2022 we got funding. And the goals of our project were to enroll 200 individuals with the 3q29 deletion and collect phenotypes from them as well as their parents. We'd like to understand the 3q29 duplication, and we've recently come to the understanding that we probably need some controls, and I'm going to show you some data that supports that.
The groups in G2MH and specifically project 2 and project 3 have been incredibly gracious in sharing their paradigms with us, and so we have been able to use not only harmonized data collection instruments, but paradigms to implement those instruments in the same way as other groups, to facilitate cross-disorder comparison. And of course, even at a distance we can collect DNA. We can collect a saliva sample, and with that DNA we can ask how polygenic background and second site mutations may contribute to phenotypic variance.
So these are sort of the domains that we prioritized as sort of our wish list for what we'd like to assess. These are the instruments that we're using, proband, and we're conducting a more limited phenotyping battery in the parents, so we're very excited to have this data to start to describe 3q29 deletion syndrome.
We just got started. We launched a few months ago, and as of last month we've enrolled 21 families with a 3q29 deletion, two families with a 3q29 duplication, and a handful of controls, and what I'd like to do is show you some of our data.
This is social disability as measured with the Social Responsiveness Scale, or the SRS. This is scored on a T-score scale, so the higher the score, the more disability there is. And you can see that our deletion probands really have quite a bit of social disability as indicated by high scores. What's interesting about this graph is that the parents themselves also have some higher scores on the SRS than we might have predicted, and this is part of what inspired us to collect some control data.
On the next slide, I have executive function, and executive function is measured with the BRIEF. This is where scores get clinically significant. You can see that our controls score right at or below the population mean, which is 50.
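For readers unfamiliar with T-scores, the scale is defined so that the normative population has a mean of 50 and a standard deviation of 10. The sketch below shows only that conversion; the normative mean and standard deviation passed in are placeholders, and instruments such as the SRS and the BRIEF typically use published norm tables rather than a single formula.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T-score (normative mean 50, SD 10)."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


# Placeholder norms, for illustration only: mean 30, SD 5.
print(t_score(30, 30, 5))  # a raw score at the normative mean
print(t_score(40, 30, 5))  # a raw score two SDs above the normative mean
```

On this scale a control group scoring around 50 is sitting at the population mean, and each 10 points above that is one standard deviation of additional difficulty.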
Our deletion probands score higher, and there's a much wider range of variance. What's interesting to us is that our parents score a little higher than controls, and we wonder if there's an ascertainment issue here. We wonder if we are selecting deletion probands that are most severely affected, and in doing so, we're sort of also selecting parents who are severely affected. I think there's a lot of really interesting things here to unpack. I see some other people smiling, because I think that you have also seen a phenomenon like that.
And then we have one duplication proband which seems to have an elevated executive function score.
Another interesting thing about this is that on our prior study we found that 46 percent of our participants had clinically significant executive function deficits, and while it's early days yet, we so far only see about half of that. Some of the data that we've collected right here, our participants had participated previously and we have prior BRIEF scores on them. And there's an exciting possibility here, which is that executive function in 3q29 deletion syndrome may improve with age.
It's early days, there's lots to talk about, but that does make sense, right. People with 3q29 deletion syndrome are on a neurodevelopmental trajectory. It's just a little bit slower, and so it may be that they are acquiring skills, just at a rate that is more slow than in the general population.
We have a really lovely RA named Matt Harner who is particularly interested in medication history, and so he developed a very extensive -- it was really hard to program this medication questionnaire -- but the way it's programmed, it makes it exceptionally easy for parents to deliver information about their medication history.
Of 14 3q29 deletion probands that we have analyzed so far, 86 percent of them are medicated at some point or another with some kind of psychiatric medication.
This next graph shows the age at which these kids received their first medication, and you are seeing correctly. There are two children who were first medicated at age 3; in this case they were both prescribed antidepressants. We have one child first medicated at age 5. We have six kids who were first medicated between the ages of 7 and 8. And I can't help but imagine what that family's experience was like and what the behavior of that three-year-old was like that they were really at such a place that they felt that medication was their last best option, and that there was a physician that agreed to that. I'm only saying that because I think it seems important to understand the experience of these families and perhaps to provide some guidance for them, and to maybe understand what the experience of these kids is on medication, because this is clearly off-label use.
And then kids with 3q29 deletion syndrome, very few have -- they go anywhere from zero to five medication trials, and these kids on average are age 16. So clearly parents and doctors are turning to medication, almost as a last resort, to help with the behavioral manifestations.
I started this work in 2010, and we understood that the 3q29 deletion was associated with neurodevelopmental disability, sort of in pediatric times, and then we understood that there was an increased risk for schizophrenia as these kids age. And here's what we understand in 2023. I'm super proud of the work that we've done. There's so much more to do, but I feel like we’ve really moved the needle on understanding this syndrome.
In conclusion, systematic deep phenotyping is critical to reveal the neurodevelopmental and psychiatric morbidity in all CNV disorders, but 3q29 deletion syndrome in particular.
Remote phenotyping is a very viable paradigm for data collection. And because we have these harmonized instruments, we will be able to do cross-disorder comparison.
In future directions, I think longitudinal data collection is imperative to identify vulnerabilities and strengths across the lifespan. And data from deep phenotyping can often shed light on mechanistic hypotheses, as well.
I just want to acknowledge my awesome team. People who are involved in the remote phenotyping study include Mike Epstein, our MPI on this project, who is going to guide us through the genomic analyses. Terry Irving is our project manager, she really keeps the wheels rolling and keeps us all organized. Matt, Dani, and John are people with boots on the ground evaluating kids with 3q29 deletion. And they've been trained, in fact Raquel knows their names probably because they have been trained very copiously on the Penn CNB. And then we have two psychiatrists, Dr. Deo is a child psychiatrist, and Dr. Pato is an adult psychiatrist, in case our study subjects need help.
Thank you so much for listening, I'll be happy to answer questions.
JONATHAN SEBAT: Our Q&A will be after two more speakers. We have the good pleasure of introducing Armin Raznahan. Sex chromosome aneuploidies -- XXX, XXY, XYY, are incredibly common in the population. If you bother to ask your grad student to look at the aneuploidy in the data set, they're actually really common, and they're associated with a variety of psychiatric traits, so it's actually something really fascinating to study. So I'm delighted to hear from Armin.
Sex Chromosome Dosage Effects on Human Brain and Behavior
ARMIN RAZNAHAN: Thanks, Jonathan, and thank you all. Really exciting to be here in person with everyone and have the chance to hear your incredible work and share some of ours.
I'm here to talk about sex chromosome dosage effects on human brain and behavior, and our group is based just down the road at the NIMH Intramural Program Bethesda campus, with a section on developmental neurogenomics, and we're dedicated to trying to better understand the biology of neurodevelopmental disorders, or NDDs, in ways that might ultimately help to improve disease prediction, detection, and treatment.
We approach this overarching goal through three interconnected research themes, and theme one is the one that I'm going to focus on today because it's the most relevant for the G2MH mission, and that's trying to parse the clinical and biological complexity of common neurodevelopmental disorder presentations, like autism and ADHD, by deep phenotyping rare genetic conditions that substantially increase risk for these common outcomes. And we've considered many different genetic disorders in this work, but as Jonathan was saying, we focus in particular on sex chromosome aneuploidies.
Why do we do this? They have their problems, but I think there are many really exciting reasons to motivate and encourage using sex chromosome aneuploidies as genetic first models in psychiatry and as conditions of clinical importance in their own right.
I think what makes them especially attractive to study or valuable to study, as Jonathan said, they're collectively common, affecting 1 in 400 individuals, and this facilitates gathering the large sample sizes you need to get a handle on variable penetrance and ascertainment bias effects. They have these diverse carrier types, and we know that they increase risk for various neuropsychiatric outcomes, with some evidence of dissociability depending on whether you're talking about a modulated X or Y dose.
Most affected individuals can tolerate deep phenotyping, and we've been seeing many examples of how important deep phenotypic data are, and unlike copy number variations, which come in 1, 2, or 3 doses, you can get these incredibly wide dosage ranges in sex chromosome aneuploidies, which enable you to use parametric analytic approaches to get a better statistical handle on direct gene dosage effects.
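A simple way to picture the parametric approach is a linear model in which the phenotype depends on the number of X and Y chromosomes a person carries. This is an illustrative sketch only, not necessarily the model used in the studies described:

```latex
y_i = \beta_0 + \beta_X\, n_{X,i} + \beta_Y\, n_{Y,i} + \varepsilon_i
```

Here $n_{X,i}$ and $n_{Y,i}$ are the X and Y chromosome counts for individual $i$, for example $(2,0)$ for XX, $(1,1)$ for XY, $(2,1)$ for XXY, $(1,2)$ for XYY, and $(3,0)$ for XXX. Because the counts take several values rather than a simple carrier versus non-carrier contrast, $\beta_X$ and $\beta_Y$ can be estimated as separate per-chromosome dosage slopes.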
They also benefit from some pretty attractive murine models of X and Y dosage variation, and the sex chromosomes tap into this broader topic of the biology of sex differences, and we all know sex is such a potent modifier of neuropsychiatric outcomes.
So there's a long history of studying sex chromosome aneuploidies in the intramural program, and the most recent phase involves families coming to the NIH for two days, probands undergo over 80 person-hours' worth of assessment during this two-day visit, and we gather phenotypes spanning this range that I'm showing in the image here, and I'm giving some examples of the phenotypes we measure on the right.
This is, as we've been hearing from many in the group, difficult work to do, but I think this comparative deep phenotyping of distinct genetic groups is really the only way of filling out this phenotype-by-genotype matrix, which is a necessary starting point to trying to empirically retrace the pathways that connect the gene dosage variation to the surface NDD phenotypes we care about, and in particular you need a relatively full phenotype-by-genotype matrix to have a sensible chance of searching for these potential points of convergence -- which pathways overlap and which pathways are separate from each other, and that's obviously of great theoretical and practical importance in psychiatry.
Today's talk is going to give you a taste of some work within this framework. I'm going to start off telling you a bit about studies of behavior and clinical outcomes, and then I'll finish with telling you about some of the X and Y chromosome dosage effects we've been studying on brain organization, and this work will primarily concern neuroimaging phenotypes, but I'll also give you a hint on some work we're trying to do to decode some of those imaging changes, to nominate candidate molecular-cellular pathways that might lead to them.
Starting with the clinical outcomes. A sensible first question is just how penetrant are sex chromosome aneuploidies for common-garden psychiatric diagnoses? I think probably the best available evidence for this comes from the iPSYCH consortium. These are colleagues Thomas Werge and his team, who combined population registries with neonatal blood banks to try and get relatively unbiased estimates of the penetrance of different gene dosage disorders for psychiatric diagnoses.
I'm going to walk you through how this works. Here, each of these rows is a different sex chromosome aneuploidy group. And the vertical solid line is the reference. So the further to the right you are of that line, the greater the hazard ratio of that sex chromosome aneuploidy for attention deficit hyperactivity disorder. And I'm showing you the hazard ratios on the right-hand side, and I'm giving us some benchmark lines from other studies that relate to the iPSYCH estimates of penetrance for 22q11 deletion and 16p11 deletion.
You can see that many of the penetrance estimates are really quite pronounced for these sex chromosome aneuploidies for ADHD, and these are some of the graphs for other diagnostic outcomes. So I think this shows that sex chromosome aneuploidies are penetrant for psychiatric diagnoses in a population-based study design and that the magnitude of penetrance you see is in the same ballpark as, if not sometimes higher than, some of the more classically studied CNVs that immediately come to mind, and I think this is really important because the community of individuals and families affected by sex chromosome aneuploidy really feel that there's a lack of public awareness of some of the difficulties that their children can have. So I think it's important to kind of shine a light on some of the psychiatric morbidity that can be imparted by these conditions.
But these are diagnostic data, and there's really evidence of substantial subdiagnostic threshold burden of symptoms. I'm showing you a taste of this from our XYY cohort. The rows are diagnostic outcomes, and the columns are continuous measures of psychopathology across different scales, as measured by the CBCL. The color tells you what proportion of people who don't have that diagnosis have a score on that scale that's in the clinical-to-borderline range.
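The quantity behind each cell of that heatmap can be sketched in a few lines. This is only an illustration of the calculation as described, not the group's actual pipeline: the records are invented, the function name `subthreshold_burden` is hypothetical, and the cutoff of 65 is an assumed T-score threshold for the borderline-to-clinical range that should be checked against the scale's manual.

```python
def subthreshold_burden(records, diagnosis, scale, cutoff=65):
    """Fraction of people WITHOUT `diagnosis` whose `scale` score is >= cutoff.

    `cutoff` is an assumed borderline-to-clinical threshold (hypothetical default).
    """
    undiagnosed = [r for r in records if diagnosis not in r["diagnoses"]]
    if not undiagnosed:
        return None  # nobody lacks this diagnosis, so the cell is undefined
    elevated = sum(1 for r in undiagnosed if r["scores"][scale] >= cutoff)
    return elevated / len(undiagnosed)


# Invented toy records: diagnoses held plus T-scores on two scales.
records = [
    {"diagnoses": {"ADHD"}, "scores": {"attention": 72, "social": 66}},
    {"diagnoses": set(), "scores": {"attention": 67, "social": 58}},
    {"diagnoses": set(), "scores": {"attention": 55, "social": 60}},
    {"diagnoses": {"ASD"}, "scores": {"attention": 69, "social": 75}},
]

# Rows are diagnoses, columns are scales, as in the figure described.
heatmap = {
    (dx, scale): subthreshold_burden(records, dx, scale)
    for dx in ("ADHD", "ASD")
    for scale in ("attention", "social")
}

for (dx, scale), fraction in heatmap.items():
    print(f"no {dx} diagnosis, elevated {scale}: {fraction:.2f}")
```

A high value in a cell means that many people who do not carry the diagnosis still score in the elevated range on that scale.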
So this is telling you that even in people who lack a given diagnosis, there's a substantial, clinically relevant subthreshold symptom burden. So this will motivate looking at dimensional measures of psychopathology, and our field has really just been grappling with this inherent tension between the study designs that you need to minimize ascertainment bias as compared to the study designs that you need to maximize your ability to get dense homogenous phenotypic data on multiple groups.
I don't think there's any right position on this spectrum. I just think you have to know where you are and what the pluses and minuses are of the position your study design adopts. Obviously, the optimum approach is having complementary studies at different points in this compromise.
We very much work in the space of smaller groups that come with some ascertainment bias, but that you can get super-dense phenotypic data on. These are some of the things that you can do when you're working in that space. These are data from 64 individuals with XYY syndrome, and I'm showing you the distribution of scores that these individuals show on 66 dimensional measures of psychopathology, representing the diverse domains of clinical relevance. And you can systematically rank these domains to fine-map the behavioral phenotype of XYY. What is hit most and what is hit least?
You can see that domains of social impairment are the most prominently impacted, but in contrast, domains of proactive aggression are really hardly impacted at all, and that's super important, because there's this stigmatizing notion of XYY as being a condition that promotes proactive aggression.
Interestingly, this gradient of impact is .9 correlated between a high ascertainment bias subgroup and a low ascertainment bias subgroup. Whilst the point estimates of penetrance are likely to be somewhat inflated in this type of study design, the statement about the relative impact across domains seems to be pretty well preserved.
There are some other cool things you can do when you have this high-density phenotypic data. One thing is identify axes of symptom covariation within your gene dosage disorder of interest. So this is a matrix showing the correlation between all 66 scales within 64 people with an extra Y chromosome. And what this says is that you can take these 66 scales and compress them into 10 partially dissociable axes of clinical variation, which you can represent as a network.
So the hub of the psychopathology network in XYY is the overall load of inattention symptoms. And when you get this compressed view, you can now ask how do these different dominant domains of clinical variation relate to other phenotypes I care about? And you see some interesting dissociations.
The overall load of externalizing symptoms is pretty strongly correlated with caregiver strain, whereas the overall load of early social impairments doesn't really as strongly hit caregiver strain but is a pretty strong predictor of impairments in adaptive function. And I think this is useful for refining lightweight, light-touch ways of measuring the clinical constellation of presentation, and also zeroing in on which axes matter for the real-world outcomes we care about. And that will vary depending on your context.
When you have these dimensional data, you can also get quite refined estimates of how similar are two different gene dosage disorders in terms of their behavioral phenotype? I'm showing you that here. Each of these is a separable continuous measure of psychopathology, and I'm plotting the mean effect size on that domain when you add a Y chromosome to a male, against the mean effect size on that domain when you add an X.
What you see is the correlation is about .8, so that means that if you know what's impacted worst in XYY, it's pretty likely to be impacted worst in XXY. But you'll note that this fit line is off the identity line. That means most things tend to be more strongly impacted by an extra Y than an extra X, but there are interesting exceptions.
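The comparison just described, a correlation across domains plus a fit line checked against the identity line, can be sketched as follows. The effect sizes are invented for illustration and are not the study's values; the domain names and helper functions are likewise hypothetical, and the choice to put the extra-X effect on the x-axis is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


# Invented mean effect sizes per domain: (extra X in XXY, extra Y in XYY).
effects = {
    "social impairment": (0.6, 1.4),
    "inattention": (0.7, 1.1),
    "externalizing": (0.3, 0.6),
    "internalizing": (0.8, 0.8),
    "proactive aggression": (0.1, 0.1),
}

xxy = [pair[0] for pair in effects.values()]
xyy = [pair[1] for pair in effects.values()]

r = pearson_r(xxy, xyy)
slope, intercept = least_squares_line(xxy, xyy)

# Domains lying above the identity line are hit harder by an extra Y.
y_stronger = [name for name, (x_eff, y_eff) in effects.items() if y_eff > x_eff]

print(f"r = {r:.2f}, fit line: XYY = {slope:.2f} * XXY + {intercept:.2f}")
print("stronger with extra Y:", y_stronger)
```

A slope or intercept that moves the fit line above the identity line corresponds to the overall pattern of a larger Y effect, while domains that sit on the identity line are the exceptions where the two chromosomes look equipotent.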
Measures of social impairment are prominently sensitive to a Y relative to an X, but strikingly, measures of difficulties with internalizing symptoms -- for those, the X actually seems to be equipotent to the Y, which is a surprising deviation from this overall pattern of the Y generally being kind of worse overall.
So when we saw this congruence between the effect of the X and Y across multiple dimensional measures of behavior, it naturally raises the question, well, how congruent are the X and Y effects on brain? I'm going to tell you about that next.
The approach we took here was to ask this question of brain vulnerability using many phenotypes, many measures of the brain at once, and also including another common aneuploidy in humans, Down syndrome, as a sort of control aneuploidy.
This was work led by Lisa Levitis. We have our three case control cohorts with neuroimaging data, one capturing the effects of adding an X, one capturing the effects of adding a Y, and one capturing the effects of adding chromosome 21. And in each of these conditions, we measured cortical change using 15 different measures of cortical structure and function, from three different neuroimaging modalities.
We derived these effect sizes for each of 300 brain regions, and what that means is that you can represent brain vulnerability to each aneuploidy as a rich matrix of 15 different measures of cortical change for each of 300 brain regions.
When you have this sort of data, you can start to look for organizing principles that govern the pattern of regional brain vulnerability in humans to aneuploidy. But I cut that slide for reasons of time, so I'll take you to the next output of this analysis, which is how can you compress this information?
We carried out principal component analysis on each of the matrices, and what we found when we looked at these principal components is that each aneuploidy hits your brain with a totally different fingerprint. The overall constellation of effects is very different between the aneuploidies. But the first principal component of each matrix is positively correlated between the aneuploidies.
So what I'm showing you here, these are regional scores for the first principal component of multimodal brain change when you add an X, when you add a Y, and when you add a 21. And these, you can see that the regional scores are pretty highly correlated between XXY and XYY. So just as the profile of behavioral change was highly congruent, the profile of brain change seems to be highly congruent between the X and the Y, and the feature loadings on that PC are also highly congruent.
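The mechanics of that step, taking a regions-by-measures effect-size matrix per condition, extracting regional scores on the first principal component, and correlating those scores between conditions, can be sketched as below. This is an illustration under stated assumptions, not the published analysis: the matrices are simulated (30 regions by 5 measures rather than 300 by 15) and are deliberately built to share a regional profile, so the positive correlation is constructed rather than discovered. The first component is found by plain power iteration, and because the sign of a principal component is arbitrary, the code adopts one possible convention (loadings sum to a positive value) before comparing conditions.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def first_pc(matrix, n_iter=100):
    """Regional scores and loadings for PC1 of a regions x measures matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Center each measure (column) across regions.
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]
    # Measures x measures cross-product matrix (proportional to covariance).
    cov = [
        [sum(r[i] * r[j] for r in centered) for j in range(n_cols)]
        for i in range(n_cols)
    ]
    # Power iteration converges to the leading eigenvector.
    v = [1.0] * n_cols
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_cols)) for i in range(n_cols)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # PC signs are arbitrary; orient so the loadings sum to a positive value.
    if sum(v) < 0:
        v = [-x for x in v]
    scores = [sum(r[j] * v[j] for j in range(n_cols)) for r in centered]
    return scores, v


random.seed(0)
N_REGIONS = 30
WEIGHTS = [1.0, 0.8, 0.6, 0.9, 0.7]  # invented loading of each measure
shared_profile = [random.gauss(0, 1) for _ in range(N_REGIONS)]


def toy_effect_matrix():
    """Simulated effect sizes: shared regional profile plus independent noise."""
    return [[s * w + random.gauss(0, 0.3) for w in WEIGHTS] for s in shared_profile]


scores = {name: first_pc(toy_effect_matrix())[0] for name in ("XXY", "XYY", "T21")}
r_xxy_xyy = pearson_r(scores["XXY"], scores["XYY"])
r_xxy_t21 = pearson_r(scores["XXY"], scores["T21"])
print(f"PC1 regional score correlation, XXY vs XYY: {r_xxy_xyy:.2f}")
print(f"PC1 regional score correlation, XXY vs T21: {r_xxy_t21:.2f}")
```

With real data, the sign convention matters: without aligning component orientation across conditions, a correlation between first components could come out negative purely by chance of the decomposition.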
So we're seeing this recurrent signature of strikingly convergent effects of an X and a Y on human behavior and human brain organization, which is kind of wild when you think about how different the X and Y look, even down a microscope, you can kind of tell the difference between them in a metaphase spread.
Interestingly, though, for trisomy 21, the first principal component is less like the X and Y effects, but still positively correlated. So that tells you that although these three aneuploidies have a different fingerprint in how they hit the brain, the first principal component is positively correlated between these three dosage conditions, more so between the sex chromosome aneuploidies, but also to some extent between the presence of an additional X, Y, and 21.
And this lets you compute an average map of cortical vulnerability to human aneuploidy, which I'm showing here, where the intense red and blue regions are the areas that are most vulnerable to organizational change in structure and function, when you add a chromosome.
This map was striking to us because these are such distinct gene dosage disorders in terms of their gene content and size. And the initial question was what could account for this map of brain vulnerability? Are there any properties of cortical organization that follow this map? So we decoded this map against publicly labeled maps of gene expression in the human brain and found that those red regions of vulnerability within the ventral attentional system, the anterior cingulate and the rostral insula, are enriched for gene expression signatures that tag serotonergic signaling, the synapse, and particular cell types.
I'm not arguing these cortical features are causal for regional brain vulnerability, but I do think this type of approach provides a data-driven framework for narrowing down the search space of molecular and cellular pathways that could potentially impart regional brain vulnerability.
More striking still, and this is the last finding I'll leave you with, is that the average map of human cortical vulnerability to aneuploidy was about .6 correlated with the first principal component of human cortical change in multiple psychiatric conditions. Many of you are familiar with the ENIGMA consortium that maps structural change in the cortex for ASD, ADHD, depression, OCD, et cetera.
And the first principal component of cortical change in these very heterogenous behaviorally defined groups showed some spatial congruence with the first principal component of cortical change in these three very different aneuploidy conditions. And this is a very exciting, I think, potential translational target for future study.
We're trying to get a handle on the molecular and cellular pathways that could underpin regional brain vulnerability through multiple complementary approaches. The first is developing computational tools where you can take any neuroimaging map you like and decode it against post-mortem data to find gene expression signatures that look like that map.
The second approach we're taking is to go into the mouse, and to identify regions of brain vulnerability in the mouse, dissect those and understand the cell type changes that underpin the regional anatomical change; and the third is to take human-derived cell lines and in vitro model the effect of these gene dosage conditions on human tissue, and hopefully, these sort of complementary approaches will triangulate down on some recurrent signal.
Just some take-home points. I hope I've convinced you that aneuploidies can show high penetrance for major psychiatric diagnoses, but there's substantial symptom burden below diagnostic thresholds. There's real value to fine-mapping psychopathology, even though that comes with the risk of inflating ascertainment bias. The supernumerary X and Y chromosomes have this really surprising ability to have similar effects on behavior and the brain. This convergent effect of the X and Y chromosome is also partly echoed by the extra 21, and strikingly, these shared effects amongst these three aneuploidies actually echo the shared aspects of brain change in behaviorally defined psychiatric groups, pointing potentially towards a core axis of human cortical vulnerability in the context of neuropsychiatric impairments.
And the most important slide of all is the one that acknowledges the families that gave their time and their faith and investment in this study and the amazing team of researchers that did the work and it's been a pleasure to collaborate with. Thank you.
JONATHAN SEBAT: And our last talk in this session before the Q&A is Daniel Moreno DeLuca, describing his work on 17q12 CNVs and 15q13 deletions.
Setting the Ground for Precision Medicine in 17q12 CNVs and 15q13.3 Deletions
DANIEL MORENO DELUCA: Thanks, everyone. It is a pleasure to be here with you today, and I'm sorry that I couldn't be in person.
I wanted to tell you a little bit about our work in both 17q12 deletions and 15q13.3 deletions, with a particular focus on potential clinical interaction and community engagement.
PRISMA Research Group is located in traditional indigenous lands, and this is a very nice acknowledgement that I found is quite useful to my presentation here in Canada. So we are grateful for them allowing us to work in their communities.
No conflicts of interest on my end.
This is run of the mill, and we've heard this a lot, so I won't spend too much time on this. But we know that we are trying to bring together both behaviorally defined and genetically defined diagnoses. And as a quick reminder, autism and neurodevelopmental conditions have a strong genetic basis that can explain up to 40 percent of cases, although no individual CNV or SNV tends to account for more than 1 to 2 percent of cases, which is one of the reasons why we are having this consortium and why I am so grateful to be part of this.
The other key finding is that there is quite a bit of heterogeneity and overlap, both in regard to the clinical presentation and the genetic etiologies, with a particular focus on pleiotropy, meaning that these individually rare genetic causes can often affect multiple organ systems.
This is another way of conceptualizing that same idea. We have the universe of people on the autism spectrum, as shown in the beige space under the umbrella, and we can see that that universe can be composed of many rare genetic causes coming together, plus some space that we have yet to identify. But then if we start with a different axis and we look at these specific genetic conditions, and I on purpose chose some of the most frequent ones to draw attention to how useful this approach is, even for conventional genetic conditions, we can see that they can fall more or less under that autism umbrella.
So it's a way of reconciling that heterogeneity, where we have cases, let's say, of Down syndrome where only a small proportion of people with the trisomy 21 can fall under the autism umbrella, while the converse is true for people with Fragile X syndrome. So this is a way of bringing together that heterogeneity, but that overlap as well.
Why is this relevant? Again, we've heard a lot about starting from the genetic standpoint. Let's start for a second with the clinical standpoint and with the phenotypic standpoint. So, we know that the population in the United States is about 340 million people, and we know that the population frequency of autism continues to rise, and it now touches the lives of about 1 out of every 36 people.
And we just reviewed that a genetic cause can be identified in up to 40 percent of people with this diagnosis.
Even by using conservative estimates, that means that there's over 1.8 million people in the United States with autism stemming from a genetic cause, and just to put that into perspective, there's almost twice as many people with autism and a genetic cause as people with HIV in the United States. Let that sink in for a minute. I know I'm preaching to the choir, but I'm also taking the opportunity that this is (inaudible) and let's think of the resources, both from the research and clinical side that are adequately allocated to HIV care, and now let's think about an equally sized or even much larger population that remains invisible because of lack of genetic testing and because of lack of awareness. How do we address this challenge?
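The arithmetic behind that estimate can be checked from the three figures quoted in the talk: a US population of about 340 million, an autism prevalence of about 1 in 36, and a genetic cause identifiable in up to 40 percent of cases. The talk does not say which fraction underlies the conservative figure of 1.8 million, so rather than guessing, this sketch backs out the fraction that figure implies. The variable names are illustrative only.

```python
# Figures as quoted in the talk.
US_POPULATION = 340_000_000
AUTISM_PREVALENCE = 1 / 36
MAX_GENETIC_FRACTION = 0.40
QUOTED_CONSERVATIVE_COUNT = 1_800_000

# People with an autism diagnosis at the quoted prevalence.
autistic_people = US_POPULATION * AUTISM_PREVALENCE

# Upper bound if a genetic cause were identified in the full 40 percent.
upper_bound = autistic_people * MAX_GENETIC_FRACTION

# Fraction of diagnosed people implied by the quoted 1.8 million figure.
implied_fraction = QUOTED_CONSERVATIVE_COUNT / autistic_people

print(f"people with autism: about {autistic_people:,.0f}")
print(f"upper bound with a genetic cause (40%): about {upper_bound:,.0f}")
print(f"fraction implied by 1.8 million: about {implied_fraction:.1%}")
```

On these inputs the quoted 1.8 million corresponds to a genetic cause in roughly one in five diagnosed people, about half of the 40 percent upper bound, which is consistent with the speaker calling it conservative.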
We can do this by focusing on several different areas, and we'll touch upon three of these today very briefly in the next 11 minutes or so.
On research, we'll try to leverage understanding the diagnosis through the lens of genetics, and this brings us to the main focus of our work. Historically, 17q12 deletions, which we know are one of the recurrent copy number variants that we've heard about today, were not known to be impacting the central nervous system, and in fact they were quite frequent in people diagnosed with maturity onset diabetes of the young type 5, or with renal cysts, or other kidney complications, or uterine complications for biological females. But there was no overt mention of the neuropsychiatric implications of having this particular genetic change.
With our work back at Emory with David and Christa and James, so it's nice to have the team together again, we were seeing that people were being referred for autism as a reason for genetic testing, and we were finding these deletions. So we wanted to explore this in a little bit more detail.
Just to orient us, this is a quick overview of the 17q12 deletion region. We have the 15 genes that are involved in that interval, including HNF1B, which is known to cause several of the somatic phenotypes associated with this deletion, mostly the renal cysts and the diabetes. And this is just a patient with a deletion that we can see is flanked by those two segmental duplications.
After following up that story several years ago, we looked at a large collection of clinical genetic testing labs, where we identified about 18 people with this deletion out of 15,000, while it was not present in any controls. So that led us to query additional autism and schizophrenia cohorts, finding that it was quite frequent in that group of cases combined, and it was very rare to identify this particular CNV in control populations. This allowed us to establish that it had a strong association with autism, schizophrenia, and clinical referrals -- or just with psychopathology in general.
This is one of the areas that I like the most, but it allows us to get to know patients a little bit more. Each of these columns is one of the patients of that original study, and there are some subtle facial features, as we can see, with a depressed nasal bridge, maybe a little bit of a higher forehead. But if anything, you could potentially think that these participants were related, rather than having easily recognizable clinical morphology -- clinical features -- associated with their deletions. We see how those features change over time as participants age. This is in addition to the renal cysts that also tend to be common, and the diabetes.
We know that that 17q12 deletion confers a high chance for autism and schizophrenia, that it is rarely seen in control populations, that many of the cases are de novo, and that when inherited from a parent, the parent tends to have a related psychiatric diagnosis. So we wanted to understand a little bit more what came after this, which led us to our current study.
What we're doing here is very much in line with the Genes to Mental Health Network and one of the reasons why we really wanted to be part of it and why we are grateful to be part of this effort.
We wanted to go beyond those dichotomous diagnoses, autism yes-no, to actually understand some of those domains in a quantitative way, which we know are distributed normally in the general population. And that applies not only to the behavioral domains, but even to some of the somatic clinical features that we were seeing as well, with a strong focus on remote and longitudinal data acquisition, which I'm glad we put in place in our proposal, because it was funded right before the pandemic. So thankfully we were geared to continue carrying this out as things moved forward.
We also want to understand the contribution of background genetic variation, as captured by polygenic risk scores. And we're hoping that this will set the basis for genomically informed clinical trials.
This slide, we already heard about this from Christa, which is another way of reconceptualizing how we consider that someone is impacted by one of these genetic conditions. I won't spend much time on this.
This is where we're at. Our group moved from Brown University to the University of Alberta fairly recently, so we had to reshuffle some of our protocols and get used to the new country and to the research in this new environment, but we right now have about 12 people with the deletion, and 15 with the duplication in 17q12, and this is a brief overview of their demographic information. It's fairly split between biological males and biological females. We have an overrepresentation of white race, but we're looking forward to ways of boosting representation of other racial backgrounds.
For people in whom we have inheritance information, we have several that were de novo, but we can also see some inherited cases, especially for the duplication. And then in regard to the instruments that we are using in our studies, we have a strong focus on medical history that aligns very well with what we were seeing and hearing about from Jen and the rest of the group. But also with some of the core features and core instruments that are being used as part of the Genes to Mental Health group, so we have CBCL/ABCL, the MINI neuropsychiatric inventory, RBS-R, SRS-2, and the PQ-B.
All of these as measures of psychotic or prodromal symptoms, social abilities, repetitive behaviors, general psychopathology, and we are also using the Penn CNB as a measure of speed and accuracy of cognitive abilities. We don't have those reports just yet, but it has been a great tool that we have thankfully, successfully implemented remotely and which we are hoping to help translate into Spanish to widen our referral pool.
Just taking a second to focus on that. We are also part of the Latin American Genomics Consortium, associated with the PGC, and we're hoping to launch these studies for Latin American populations. As you may be able to tell, my accent is Colombian, so I would really like to leverage some of those connections and some of those networks for the benefit of people with these rare genetic conditions in Latin America, as well.
These are some of the co-occurring conditions in these populations, and we see that autism spectrum disorder tends to be quite frequent, as well as general developmental disorders. ADHD, which tends to be more frequent in those with the 17q12 duplication than in those with the deletion. Interestingly, we did not find cases of schizophrenia in this population. And then we did find seizures more frequently in those with the duplication, compared to those with the deletion.
I am also showing data for the 15q13.3 deletion, which is a strong interest of our group, especially as we could be moving forward towards clinical trials in that space. I am quite aware of the time, and I'm not sure that I'm going to be able to go into that story, so if you don't mind going to the next slide.
I can't really see you guys, but if you want me, just let me know when you want me to stop. Please cut me off, I won't take it personally, but let me share a little bit about our 15q13 story.
We were hoping to leverage insights from clinical care to move beyond these assessments toward clinical trials, and the argument is that we could be using this genetic information to inform treatment for our kids and adults based on means that are readily available now. As exciting as gene therapy and antisense oligonucleotides are, there are ways that we can leverage this information to put something in place in the clinic.
We like to call this the actionable genome, and there are several entry points to that actionable genome, and it could be susceptibility to specific medication side effects, or pharmacological treatment based on the underlying biology, or prophylaxis of other co-occurring conditions that may be psychiatric or somatic in nature.
This is just an example of how we have been doing that. Case 1 is an adult, a 39-year-old male with psychosis and rage outbursts and epilepsy, who was in treatment with a whole host of medications. In fact, this was a patient of Dr. Cubells back at Emory, who was described previously in the literature and who we re-included in our analysis.
Case 2 is a kiddo who we saw at Bradley Hospital, also with very significant behavioral outbursts and autism, on treatment with lithium, trazodone, quetiapine, lorazepam, guanfacine, and a whole host of other medications. So this echoes what Jen was telling us about medication use, especially in relatively young people.
It turns out that they both had this 15q13.3 deletion, which we know includes the cholinergic alpha-7 receptor. So how can we leverage that information potentially clinically?
This is the region which we all know. It turns out that initial patient was placed on galantamine as a way of inhibiting the breakdown of acetylcholine and bringing that cholinergic (inaudible). And it led to a greatly reduced frequency of hospitalization for that case.
So we tried it again in our second case in the hospital, on the inpatient unit, and that also led to greatly reduced polypharmacy. Now the good thing and the helpful thing is that we already have evidence that this medication has been well-tolerated and associated with no major side effects in people with autism in a randomized double-blind clinical trial, just not in this context. And it has been used in 15q13 deletion, but only once. And we also know that there's a strong component of aggression phenotypes associated with the animal models. So all of this made sense.
With this, I just want to show you the third out of the seven cases that we've recruited so far, where we started this medication. On the x-axis we see time, and on the y-axis we see the frequency of aggressive episodes, averaged per week, on an inpatient unit. So we saw that they tended to diminish quite significantly over time.
Now, galantamine was not the only intervention here. These kiddos were also in an inpatient unit so they tend to get better over time, but this led us to discontinue several antipsychotics, which lowers that polypharmacy and can be an entry point for a well-informed clinical trial.
I think with that, we are putting this data together as a case series, and we are working on establishing this clinical trial, in close collaboration with the 17q12 community.
I think I will leave it at that, but the very last slide thanks everyone that has made all of this possible. We work very closely with the community, not only for our 17q12 families, but for 15q13.3 deletion families, and we want to make sure that we deliver the results of these studies that they participate in and that we hear directly from them about their priorities. So this is an important part of our research, and we put out newsletters and materials that are easily accessible for families.
Thanks to all who have made this possible, especially the families and our collaborators. With that, I will wrap it up.
Discussion and Wrap-Up
JONATHAN SEBAT: Okay, we have ten minutes left for our Q&A session. I'll take chair's prerogative and kick it off with a question for Armin. Obviously, it's really intriguing that increased dosage of the X and Y seems to be impacting similar brain regions and similar psychiatric traits. Obviously there's the question of how and why, and so the question is what do you know about differentially expressed genes in these aneuploidies, and/or is it just the pseudo-autosomal region?
ARMIN RAZNAHAN: That's the question that's been on our minds, too. The fact that they have these convergent effects draws the mind first to what genes do they share? And the pseudo-autosomal ones have sequence homology and undergo full recombination like any other chunk of the genome, so that's one set of candidates. And then another set of candidates are these gametologue genes, which are notable because they're the only ones that the Y used to share with the X that have still been retained for some reason. And they've been retained over multiple species, and they're highly dose sensitive. So they show high sequence homology, but they don't undergo recombination any more.
In our expression studies in skin, LCLs, and induced neurons, we found that these gametologues are canaries in the mine. They are always differentially expressed and very highly. And they're involved in these key steps along the central dogma. So actually my current guess would be that it's the gametologues, they're higher on my culprit list.
Gametologues, these are ancestral X and Y chromosome genes that have retained copies on the Y, and they're distinct on the Y, because whereas most Y genes are highly expressed in the testes, this subset of Y genes, the gametologues, are the only ones that have this broad expression in non-gonadal tissues. So they're at the top of our list at the moment.
SEBASTIEN JACQUEMONT: Just a follow-up question on Jonathan's -- are these gametologues the genes in the pseudo-autosomal region?
ARMIN RAZNAHAN: No, they're outside the pseudo-autosomal --
SEBASTIEN JACQUEMONT: So they are different. These genes are -- the ones on the Y are not on the X, and vice versa.
ARMIN RAZNAHAN: No. They exist as gene pairs that have an X member and a Y. For example, there's ZFX and ZFY.
SEBASTIEN JACQUEMONT: So they're not pseudo-autosomal genes, but they are present on the X and Y.
ARMIN RAZNAHAN: Exactly. They aren't autosomal in the sense that they no longer undergo recombination. What's interesting is they retain over 90 percent sequence homology for the most part, but the Y (inaudible) totally different regulatory contexts.
SEBASTIEN JACQUEMONT: So I have a question for Daniel. I think your example of galantamine in 15q is a great example of translating this knowledge to treatment. I have a question, because I think the role of G2MH is also to prepare for clinical trials: given your experience with recruiting 15q, and now thinking about doing a clinical trial, double-blind and well-powered, would this even be possible across sites in Canada, or would you need a North American multisite effort to get to the power required to see efficacy? I have my own answer to this, but I want to hear yours.
DANIEL MORENO DELUCA: I think that the larger the sample size we can collect, the better, and as we know these are pretty rare conditions in general. Oddly enough, and I know that back at Bradley Hospital and Brown University, where I was before, we are enriched for this population because we tailor care specifically to them, we saw four of these kiddos within the same hospital system in a three-year period, which was a lot. Much more than I expected.
I agree with you, especially if we want to do this in a standardized way, adding inclusion criteria and controlling how many medications someone is on. It has to have that naturalistic flavor for us to be able to recruit large numbers of participants, and they also have to have the behavioral presentation, which highlights the tension between a genes-first approach and needing to pick an indication for a medication to be able to help. In an ideal world we could even preemptively prevent some of these clinical features from emerging. But right now we have to be responsive and reactive with a clinical indication.
I think it would be important to recruit internationally, but I'd love to hear your thoughts on that. I suspect I know where you're going with that question.
SEBASTIEN JACQUEMONT: I was just thinking that that would be impossible to run in one site. And it would be impossible to run even throughout multiple sites in Canada. We wouldn't get enough individuals, so I think it goes back to the relevance to G2MH, where like Geisinger, many sites are recruiting or seeing these kids with 15q13.3, so it could be a great infrastructure to support such a trial that I think would be impossible across maybe one or two sites.
DANIEL MORENO DELUCA: I completely agree, and I think the other piece for us to think about is how we can leverage the outcome measures to be more aligned with the work that we're trying to do. We've brought this up before. Can we move toward ecological momentary assessments, or more granular measures that allow us to pick up a difference following our intervention much more easily than assessments that happen serially? How can we best leverage that piece? There's some interesting work in sleep recording, heart rate variability, and data coming from wearable devices that I think can enrich a clinical trial quite profoundly.
DR. FINUCANE: If I can follow up on that, too, Daniel, I'm really glad to hear you say that about other phenotypes, because we're talking about your two cases there, one going back to Emory, and really the 15q13.3 deletion has not been broadly phenotyped with regard to this aggression aspect of it. And you take any CNV and you can pick some isolated cases with severe aggression, you can go across the board to 22q or Fragile X, or anything else.
So I'd be a little hesitant to think about a clinical trial targeting just aggression. Obviously, you would want to really establish that that's a common part of the phenotype. Just as another data point, we've now returned, I want to say, 57 results to adults with 15q13.3 deletions who were picked up as part of our population health program, MyCode at Geisinger. These are adults who had gone their whole lives without knowing that they had this.
From what we can see, again, limited data for extensive phenotyping, but in talking to these adults, it does not appear that a highly aggressive phenotype was part of it, because we haven't seen the multiple hospitalizations you're talking about. Again, just a bit of caution there.
But what's interesting is you're picking out this idea of a druggable target, and then there may be other phenotypes that that could help as well, and it just so happened it helped the aggression for those two patients, but by broadening it, what if it does something overall for ADHD or something else?
RAQUEL GUR: Also, did you get input from the FDA?
DANIEL MORENO DELUCA: We did not get input directly from the FDA. But to go back to Brenda's comment first, I think it's a great point. I think we also have to be careful with regard to how we conceptualize aggression as an outcome measure, especially for a general audience, because there's a lot of stigma that may come with this, acknowledging fully that it's one of the main reasons why kiddos on the spectrum end up in the ER. Let's call that behavioral outbursts, and then if we could improve on those phenotypes, that would be dramatically better.
There could be several entry points, and I agree with you that as long as we get one that's tractable, it may benefit much more broadly than only for that given phenotype.
I'm going back to Raquel's question. We did not seek out input directly from the FDA, so these were clinical interventions based on off-label use of these medications, and we are reporting on the results of these clinical interventions. I think what's important to highlight, and it was in my slides, is that these kids accrue off-label medications with very significant side effects. Unfortunately, in child psychiatry we often have to live in that space of off-label medication use. So if we have some tools at our disposal, as long as a medication is safe and has been used in children before, I would argue that that is better than keeping people in this polypharmacy that includes mood stabilizers and antipsychotics, with all of the side effects that can come from that.
But it's a very good point; it's a fine balance. How can you get some of that traction to support the development of a clinical trial, but not overstep your boundaries? I think "do no harm" is the number one principle that can allow us to accrue some of that evidence.
ANNE BASSETT: A question for both Daniel and Jennifer, because I think there are some similarities in what you're talking about, and these are children -- because I don't hear you talking very much about adults -- with some sort of very severe phenotype that seems to be neurobehavioral, neuropsychiatric. I just wonder whether anyone in child psychiatry was ever able to give a psychiatric diagnosis that was associated with these, apart from just "behavior," which just drives me around the twist to hear, as opposed to actually trying to conceptualize it as a treatable diagnosis.
JENNIFER MULLE: Daniel, I'll go first. I believe the answer is no. I believe -- is it possible to diagnose generalized anxiety disorder in a three-year-old?
ANNE BASSETT: Not my area, but I think there are criteria.
JENNIFER MULLE: We don't know the answer to that. We don't know if our kids have a diagnosis or not. But one thing that I think is really compelling, and one reason why I'm pleased that we're extending into this area, is we may -- this is happening. The data exists. So it feels like we may as well take advantage of what's out there and ask.
So one of the questions we're asking is: was it helpful? Did this help you? Were there side effects? And I think with that retrospective data, even though it's not ideal, it's not a clinical trial, we might be able to glean clues that, like, these things don't help, these do, let's start exploring.
We have a mouse model. Let's take medications that appear to be helpful and feed them to our mice and see what happens.
ANNE BASSETT: I think it would also be helpful if there was a psychiatric diagnosis that was associated. I say this because in the adult world, perhaps, to some extent there's some greater clarity with respect to the diagnostics, and I think that in children, again, it's not my area, but it seems pretty diffuse at times, whereas we can see these emotional or temper outbursts, and I'm sure Daniel knows what I'm talking about as well.
But they're associated or they're worsened by having untreated or undertreated or inappropriately treated generalized anxiety disorder, social anxiety disorder, or a psychotic illness, and if you are able to make a diagnosis and provide appropriate treatment, the outburst melts.
JENNIFER MULLE: I think there's another rich piece of data here which is who is prescribing the medication? Is it a psychiatrist? Is it a pediatrician? I think there's so much richness here that it's incumbent upon us to explore what's going on. We've just scratched the surface.
JONATHAN SEBAT: I think we've reached the end of our time, so now is the right time to wrap up our open session. Thanks to everyone at home who dialed into our Zoom call, and thanks to everyone who came here to D.C. for this open session meeting.
We'll take a lunch break now, and we'll come into our closed session after this.
STAFF: I will make a quick announcement that the video and transcript of this open session will be posted later. It will take us a little longer to get it up, but it will be posted on the NIMH website.