NIMH Data Archive (NDA) New Grantee Orientation Webinar
TRACY KING: My name is Tracy King, and welcome to today's Orientation webinar.
If you have any questions, you can e-mail our helpdesk, and we will answer them in a follow-up e-mail. Our helpdesk e-mail address will be provided to you at the end of the webinar.
The goal of today's Webinar is to emphasize that planning early for data sharing will make it much easier at submission time.
We hope to show you the NDA tools and resources that are available to you and offer you some helpful tips along the way.
Today's webinar is the first in a series of webinars. As mentioned, it's primarily aimed at principal investigators. However, other staff are welcome and encouraged to attend. This training focuses on the first tasks that are expected to be completed early after grant award. The second webinar is about data harmonization. It's aimed at staff members who will be working to format and submit the actual data.
This training dives into the meat of the NDA data dictionary. Attending data harmonization training early on will teach you how to format the data. We urge you to take advantage of the already defined data structures in the NDA Data dictionary that can be used right away to create a method for data curation.
Using the Data Dictionary has the added benefit that your data will already be formatted for submission to NDA.
The third webinar instructs on how to validate and submit data. This webinar picks up where the data harmonization webinar left off and is offered around the two submission cycles.
This webinar helps prepare users to understand the process of validation, error handling, submission, and post submission quality assurance and quality control.
Data Access is our last in the series. It discusses the process of requesting data access and covers the various methods to query, package, and get data from NDA.
NDA can provide recordings of any of our previous webinars if you e-mail our helpdesk.
We hope that you walk away with an understanding of the next steps and know what you need to do to contact us throughout the data submission process if you need any assistance.
Let's give a little bit of background about the NIMH data archive, or NDA as we call it.
NDA is more than a database, it's a platform that supports the sharing of human subjects data.
Housed at NIMH, NDA actually started as the National Database for Autism Research, or NDAR, in 2006, and originally focused on autism-related data collected through grants funded by several NIH institutes.
In 2014, NIMH further supported the importance of data sharing and expanded this expectation to most human subjects grants that it funded. NDA continues to expand today.
As you explore the NDA website, you will see references to each of the separate repositories shown on this slide.
Currently, qualified users can request access to one or more of the repositories.
Before we jump in, I want to show you one slide that visualizes the entire data submission and sharing process. As we go along, we will show this slide again to show the webinar progress. Don't worry, not all of the steps will be covered in detail during this presentation. We are only going to cover about half of them in detail, up to about the red line.
The good news is that you've already completed two of the steps prior to award, so you may already be familiar with the data sharing language and expectations, so now we're going to pick up on activities that occur at the time of grant award.
Here is how to find the data sharing terms and conditions.
Becoming familiar with the terms and conditions should be one of the early steps after grant award. I encourage you to view these terms as soon as possible so that you know the expectations and you reduce any misunderstandings.
So now that we've gotten through the grant award segment, remember to refer to the NDA Sharing Terms and Conditions frequently. It will be one of your biggest resources.
Now we're going to move into the post award segment.
You most likely already received the initial welcome e-mail. This e-mail prompts you to complete the NDA data submission agreement or DSA and to create an NDA user account.
NDA will communicate with you throughout the entire grant cycle to keep you on track and to provide you with helpful information. To do this, we send a series of automatically generated e-mails.
A couple of tips: the e-mails we send are from the NIMH Data Archive address, so you should add our e-mail address to your safe senders list so that you don't miss any important correspondence. As always, feel free to contact the NDA helpdesk if you have any questions or need more information.
The data submission agreement, or DSA, is a policy document that is required before you are allowed to submit any data.
If you haven't done this yet, you should do it now. It doesn't take long to fill out and is an easy task to complete.
Plus, it is defined as an expectation in the NDA sharing terms and conditions, and not completing it can hold up your project later.
With the eDSA tool, you can populate, review, and submit the NDA data submission agreement. The agreement must be signed by the principal investigator and an authorized institutional business official.
Principal investigators on NIH grants with NDA sharing expectations are expected to complete the DSA within six months of initial award, the sooner the better.
Log in with your eRA Commons username and password to begin. Once you've completed the data submission agreement, you will select a signing official at your institution who will sign off on the document electronically.
If you're not sure who at your institution is appropriate for you to send the data submission agreement for approval, the NDA helpdesk can e-mail a list of your institutional signing officials. To contact the NDA Helpdesk, e-mail NDAHelp@mail.nih.gov with any questions or concerns. Now you can check the DSA off your list.
Now we're going to move onto the NDA Collection.
Once we receive the DSA from you, the principal investigator (PI) is granted ownership of their NDA Collection and a second automated e-mail is sent.
This e-mail covers the next steps in the process and includes the direct URL to your NDA Collection.
You should note your NDA Collection number, which is a four-digit number. It will come in handy if you contact the NDA helpdesk.
Those of you who have not worked with NDA before are probably wondering what an NDA Collection is.
A Collection is a virtual container for data and other information related specifically to your project.
Grant information is also contained in the Collection. A Collection is created for you by the NDA staff. To get to your Collection, you can use the direct URL that we e-mailed to you, or you can take the following steps.
First, go to the NDA website and click on My Dashboard at the top right of your screen. From there, click Login and enter your username and password. You'll land on your profile page, with your NDA Collection listed in the right-hand side menu.
So, now, let's get familiar with the Collection. Here's an example of a newly created Collection.
There are several tabs at the top, but we're only going to be concerned with three, at this point, the General tab, the Data Expected tab, and the Permissions tab.
On the General tab, you will find information related specifically to your project. If the project is associated with an NIH funded Grant, then the grant number will be provided.
Collection owners and administrators can modify any of the items that are within a frame on this page. A quick tip: when you make changes, don't forget to click Save.
In order to assign Collection privileges to your staff, you need to log in yourself.
To add another user to your Collection, such as a data manager or other staff members, you need to know their username, or the e-mail address associated with their NDA user account. If they haven't already set up an account, they need to do this first.
Once you click the “Add User” button, a green message will appear at the top to indicate that the user has been added, and their information will appear in the section below. There are three types of privileges that you can assign to a user. Submission allows for uploading data.
Query gives read-only privileges. Admin includes both submission and query privileges, plus the ability to give privileges to others and to set up Data Expected. This suits, for example, a data manager or somebody who's working with the data. Again, be sure to click Save after giving permissions, or the changes won't be saved.
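As a rough illustration of how the three privilege types relate to one another, here is a minimal Python sketch. It is only a mental model: the class, the member names, and the separate GRANT flag are hypothetical and are not part of any NDA tool or API.

```python
from enum import Flag, auto


class Privilege(Flag):
    """Hypothetical model of the Collection privileges described above."""
    SUBMISSION = auto()  # may upload data
    QUERY = auto()       # read-only access
    GRANT = auto()       # may give privileges to others (assumed name)
    # Admin combines submission and query with the ability to grant privileges.
    ADMIN = SUBMISSION | QUERY | GRANT


def can_upload(privilege: Privilege) -> bool:
    """True if the privilege includes uploading data."""
    return bool(privilege & Privilege.SUBMISSION)


def can_grant(privilege: Privilege) -> bool:
    """True if the privilege includes giving privileges to other users."""
    return bool(privilege & Privilege.GRANT)
```

In this model, a data manager who needs to upload data and add other staff would hold ADMIN, while a collaborator who only reads data would hold QUERY.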
Giving permissions to others may turn into a recurring task throughout the grant cycle. As staff come on board or transition out, you want to make sure that appropriate users are added. If staff members leave, please let us know so we can deactivate their NDA user account.
A couple of tips here.
First, as I mentioned, we send out automated e-mails. The default is to send these to the NDA Collection Owner, which is usually the PI, but it may be more appropriate for another person on your staff to receive the e-mail notifications instead. Alternatively, you may want others to receive the e-mails in addition to the NDA Collection Owner. These settings can also be changed within the Permissions tab.
For now, we're going to move away from the NDA Collection, but we will return to it when we talk about the Data Expected list. Now we're going to talk about your consent plan.
As it is still early after award, you will likely be working on your protocol and consent documents to submit to the IRB. Grant recipients must include language in the informed consent form, agreeing that data and supporting documentation submitted to NDA may be accessed and used broadly by approved users for research.
Sample consent language is available on our website to help researchers, but it's important to note that it is outside the purview of NDA staff to review informed consent documents.
Related to consent is the GUID.
The Global Unique Identifier, or GUID, is an ID number that is required for each subject.
While the GUID is based on personally identifiable information, or PII, that PII is never transmitted to NDA. In fact, it never leaves your site.
To create a GUID, you will need to ask the subject for their name, date of birth, sex at birth, and city of birth.
This information is found on a birth certificate. You should ask for their given name instead of a nickname. So, for example, Jonathan versus Johnny. Think about how you will build this into your processes ahead of time so that you get what you need the first time, and you don't have to recontact subjects. Also, any staff members who will be involved in creating GUIDs will need to have permission to access the GUID tool, and you'll need to grant them permission in your NDA Collection.
To obtain access to the GUID tool, contact the NDA helpdesk. Access to create GUIDs requires that you have NDA submission permissions on an NDA Collection.
If you don't submit data to NDA, then permission is granted on a special case by case basis. If unsure, please contact the helpdesk.
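To make the planning advice about GUID information concrete, here is a minimal Python sketch of a local intake checklist. It does not create a GUID and does not reproduce the GUID tool. It only checks, at your own site, that the four items mentioned above have been collected. The class and field names are hypothetical.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass
class GuidIntake:
    """Hypothetical local intake record; these values stay at your site."""
    name: Optional[str] = None            # given name as on the birth certificate
    date_of_birth: Optional[date] = None
    sex_at_birth: Optional[str] = None
    city_of_birth: Optional[str] = None


def missing_fields(record: GuidIntake) -> List[str]:
    """Return the names of intake fields that are still empty."""
    missing = []
    for field in fields(record):
        value = getattr(record, field.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field.name)
    return missing
```

Running a check like this at enrollment time is one way to avoid having to recontact subjects later.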
Now it's time to focus our attention back to the NDA Collection. I just want to point out that the tasks that we've completed so far, and the next task, don't necessarily need to be completed in sequential order. You or designated members of your staff can and should perform these tasks simultaneously.
A moment ago, we looked at the Collection and pointed to the three tabs that require your immediate attention. So far, we've covered the General tab and the Permissions tab. Now, we're going to look at the Data Expected tab.
The Data Expected tab is the primary tracking mechanism for your data submission project, and you or a member of your staff can set it up in your NDA Collection.
This slide summarizes the purpose and other key elements of this feature. Simply put, it is just a list of all the data that you are collecting, with the associated subject counts and the initial submission and sharing dates for each item.
The Initial Submission and Share Dates you set are based on the standard NDA sharing terms.
When the NDA Collection is created, the Data Expected list is seeded with a research subject and pedigree item, with a default initial submission date based on your start date.
This is a summary structure expected from all projects. Not only does this list, once completed, provide a common understanding of what is to be submitted and when, but it also allows for flexibility in the event that your project's timeframe changes. Many people put off creating the Data Expected list for fear that it will change. Don't worry if things change. You can change the Data Expected list at any time.
When creating your Data Expected list, you should refer to the NDA Data Standards page on the NDA website for using existing assessments, submitting new assessments, and instructions on how to submit all kinds of data.
When you first get your NDA Collection, the Data Expected list will look like this. As mentioned, NDA staff seed each Collection with a single record.
The research subject data structure (and, for omics projects, the genomics subject data structure) is expected of all newly awarded grants. It provides basic information about the subject, such as demographics, pedigree links to family GUIDs, diagnosis, phenotype, and sample location, which is critical to allow for easier querying of shared data.
When grant recipients update the data expected definition, they are expected to update the subject count, submission date, and share date for this project item also.
If someone like your program officer were to view the Data Expected tab and see only one data item listed, they would know that your Data Expected list had not yet been completed.
Let's look at an example of a newly completed Data Expected list.
As we mentioned, it's ultimately just a list of items that you're collecting and the expected enrollment numbers with initial dates for both submission and sharing. We'll be going over how to use your account to set this up in greater detail in our next webinar on Data Harmonization. But for now, we're going to go over it briefly, so you have an idea of what the process looks like.
When we say you are adding items to the list, these items are tied to one or more specific data structures in the NDA Data Dictionary.
The NDA Data Sharing Terms and Conditions and the NDA Data Dictionary will be the most helpful resources to you as you complete your Data Expected list.
To get started adding these items to the list, you will click on the green, “New Data Expected” button.
Then you will see this dialogue for adding an item.
If your data already has an associated structure in the Data Dictionary, you'll be able to find it and add it to the list using the default option, search by data structure title, as shown here.
In this example, I want to add the SCID V anxiety disorders. And I know that it's already been defined in the data dictionary. So, I'm going to use the search by data structure title option. I'll start to type in the data structure search and see that it appears in the NDA Data Dictionary. I see several options, but I want to select the one for anxiety disorders.
I can enter my targeted enrollment, let's say, 600 subjects, and set the dates based on the definitions in the NDA Data Sharing Terms and Conditions and the appropriate timeframe for my project.
Now, in this case, finding the data item I needed was easy, but what if I'm planning to use a questionnaire, and I can't find it in the data dictionary?
In this case, you want to choose the upload definition option.
You will still need to define the targeted enrollment as well as the initial submission and initial share dates. You will then provide a title for the measure and upload the file. This could be your electronic definition if you created one. In any case, you will want to provide the actual measure and any instructions and scoring algorithms that can help our Data Curation team work on a structure for you. If you have more than one file to upload, you can zip them and upload the zip file. We just want to point this out, because how to get a new data structure created is a relatively common question, and this is the process to add new structures to your Data Expected list.
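If you prefer to script the zipping step, here is a minimal Python sketch using the standard `zipfile` module. The function name and the example file names are hypothetical. The resulting zip file is what you would then upload through the dialogue described above.

```python
import zipfile
from pathlib import Path
from typing import Iterable


def bundle_definition_files(paths: Iterable[str], archive_name: str) -> str:
    """Write the given files into one zip archive and return its path."""
    with zipfile.ZipFile(archive_name, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # Store each file under its base name so the archive has a flat layout.
            archive.write(path, arcname=Path(path).name)
    return archive_name
```

For example, `bundle_definition_files(["measure.pdf", "scoring_instructions.docx"], "new_measure.zip")` would place both files in a single archive, assuming those files exist in the working directory.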
Once all the items have been added, you’ll have a completed list, as we showed earlier, and you'll be able to come in and update the dates or counts as the project proceeds, if changes are needed to keep it up to date. For our purposes today, this brief overview of what the process looks like should be helpful, but we'll be covering it in more detail in a later session, so we recommend that whoever is going to be completing the Data Expected list attend our next webinar, or view this recording, or one of the other recordings. This slide just shows some summary information on the Data Expected list.
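Before entering anything on the Data Expected tab, some teams find it useful to draft the list offline. Here is a minimal Python sketch of such a draft. The class, the function names, and the sanity checks are assumptions made for illustration. They are not NDA requirements and do not call any NDA system.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class DataExpectedItem:
    """One planned row: what is collected, for how many subjects, and when."""
    title: str
    targeted_enrollment: int
    initial_submission: date
    initial_share: date


def looks_incomplete(items: List[DataExpectedItem]) -> bool:
    """A list holding only the single seeded item has not been filled in yet."""
    return len(items) <= 1


def share_before_submission(items: List[DataExpectedItem]) -> List[str]:
    """Return titles whose share date falls before the submission date."""
    return [item.title for item in items if item.initial_share < item.initial_submission]
```

A draft like this can be reviewed with your staff and then entered item by item with the “New Data Expected” button.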
So, we've completed all of the steps that are part of the initial setup of your data sharing project. We encourage you and your staff to register for our next webinar in the series, Data Harmonization, which provides more detailed instructional material on how to complete the Data Expected list and on additional tasks associated with preparing for submission. As mentioned, all of our webinars and tutorials can be found on our website.
Since we'd like to emphasize that your data needs to be harmonized to ours, we'll also briefly outline the next phases of the process.
The first is the NDA Data Dictionary.
The Data Dictionary is a collection of all measures, instruments and assessments currently harmonized in NDA through the definition of a standard data structure.
See the full data dictionary for a list of these definitions. Click on one to enter its individual structure page and view the defined elements. These can be referenced when formatting your data collection.
Data cannot be uploaded until a corresponding data structure has been defined in the data dictionary.
So, that's just an example of what the data dictionary looks like. You can locate it many different ways from our homepage, and from some of our drop-down menus.
Now, we'll talk a little bit about the standard process of cyclical data submission and interim sharing, represented here by the data submission, QA/QC, and interim sharing steps.
In the data submission step, you will begin uploading data once it reaches the initial submission date defined in the Data Expected list, either in cumulative, biannual submission cycles, when published, or at the project end date.
In the QA/QC step, each submission of data is run through a set of automated quality checks that NDA developed, and you'll be given the opportunity to correct any detected issues.
Then, in the Interim sharing stage, your ongoing, cumulative dataset is shared typically four months after submission.
This process then repeats biannually over the course of your grant.
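To get a feel for the timing, here is a minimal Python sketch that estimates a share date as four months after a submission date, following the "typically four months" figure mentioned above. The function names are hypothetical, and the authoritative dates are always the ones in your NDA Data Sharing Terms and Conditions and your Data Expected list.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def typical_share_date(submission: date) -> date:
    """Estimate the interim share date as four months after submission."""
    return add_months(submission, 4)
```

The clamping step matters only for dates near the end of a month, where the target month may be shorter than the starting one.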
If you are collecting fMRI, eye-tracking, omics, EEG, EGG, or any other type of event-based or neurosignal recording data, you will need to use the experiment definition tool during your submission.
You can access this in the Experiments tab of your NDA Collection.
The experiment definition provides information on acquisition, processing and other parameters that will help researchers performing secondary analysis understand how your data was collected.
Once it is completed, you will receive an ID number that you will need to enter in associated subject records when uploading associated raw files. Each experiment type has a similar interface and preloaded options. When completing all required fields, keep in mind that the purpose of the definition is to provide as much information as necessary to recreate the experiment.
So, here, you will see what the experiment definition looks like. You can see the equipment and scanner checklists that you can choose from, along with the different software and stimuli, and then you would just add new ones as you go along.
Again, when you get to this process and after you've completed your Data Expected list, a data curator will be assigned to your project who can help you with any of these things. If you have any questions, again, just contact the NDA Helpdesk.
We'll also take a quick look at the NDA Study, which is how you'll define publications in the system and tie them back to the underlying data in NDA.
An NDA Study describes an analysis along with the cohorts, measures and methods of analytics, and references the underlying source data from the NDA Collection, so others can access data specific to a publication or other public disclosure of results.
Data are shared at a granular level when the NDA Study is created, for example, specific outcome measures by specific time point. This allows other unpublished data that are not scheduled for sharing to remain private.
A DOI is issued for each NDA Study, and authors are expected to reference the location of the data in the publication using the DOI.
To do this, authors are expected to contact the NDA helpdesk at NDAHelp@mail.nih.gov when a manuscript is submitted for review, so that an NDA Study can be created and the DOI can be issued and included with the actual publication.
A DOI for the underlying data may increase the visibility of your work.
We also wanted to mention a few resources available on the NDA website that can be helpful over the course of your project but are not necessarily part of any specific step in the data sharing life cycle. The first is the NDA Search tool. It is on our homepage, as shown. With it, you can search on text strings across the website content, NDA Collections, or Studies, as well as elements and structures in the NDA data dictionary and many other categories.
The second of these is the NDA Query tool. The Query tool provides numerous ways you can filter and download data from the database. It can be found on our website under the Get Data tab.
You may have noticed some of the benefits of sharing data using NDA.
You can use NDA as your data manager to a certain extent.
We have predefined data structures, so there is no need for you to do upfront work or work with another electronic data capture system. Using NDA data structures upfront can be extremely useful.
Everyone wants to start their analysis with clean data. The NDA Validation & Upload tool and post submission QA/QC processes can help improve the quality of your data. Many projects have multiple performance sites and wish to share data among these sites.
Submitting to NDA facilitates access by all investigators working on a project, even before the data have been shared with others. You can control who gets access to the data while it is in the private state.
Data are portable, regardless of your location or institutional affiliation, so you don't have to send spreadsheets to collaborators or carry data around on a flash drive.
Data are also secure. NDA staff have implemented comprehensive backup protocols to ensure that data are never lost.
Storage costs are not an issue because NDA absorbs the costs and there's no charge to submitters or users who access the data.
An NDA Study and the resulting DOI have the potential to increase the visibility of your work.
So, just as a recap, these are the things that you'd need to do next.
First, review your NDA sharing terms and conditions, complete the NDA Data Submission Agreement and upload or e-mail it to NDA, provide your staff with Collection privileges, create the Data Expected list in the NDA Collection.
Plan ahead for collecting PII.
Collect what is needed to create a GUID. Remember, this information should be recorded exactly as it appears on the birth certificate.
Ensure data sharing is covered in the consents and register for additional upcoming webinars.
Today’s webinar is at its conclusion. Sign up for our next webinars and share them with your colleagues. As always, you should feel free to contact us if you have any questions. There are resources on our website, such as video tutorials, that walk you through many of these processes step by step.
Thank you, everyone, for attending today. Again, please submit any questions and feedback to our Helpdesk at NDAHelp@mail.nih.gov. Thanks a lot and have a great day.