NIMH Data Archive (NDA) Data Submission OH Webinar
LORRAINE SIOCHI: Hello everyone. Thank you for joining our NDA data submission office hours webinar.
Today we are here to learn the general NDA data submission process during a Submission Cycle.
Some of the topics that we'll discuss today are who, what, when, and how to submit data, the prerequisites to data submission, how to check if your Data Structures are ready for submission, Submission Templates, Mandatory Data Structures, and submission exemptions.
Let's start off with who should be submitting data and when?
NDA expects you to submit data biannually by the deadlines of January 15th and July 15th if your NDA Collection is in the NIMH Data Archive Data Repository, your Collection phase is in the Pre-Enrollment, Enrolling, or Funded Completed Data Expected phase, and your Data Expected list has an initial submission date set to the deadlines of January 15th, July 15th, or earlier.
NDA expects you to submit data biannually by the deadlines of April 1st and October 1st if your NDA Collection is in the NIAAA Data Archive Data Repository, your Collection phase is in the Pre-Enrollment, Enrolling, or Funded Completed Data Expected phase, and your Data Expected list has an initial submission date set to the deadlines of April 1st, October 1st, or earlier.
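As an illustration only, the two deadline schedules above can be sketched in Python. The function name `next_deadline` and the repository labels "NIMH" and "NIAAA" are assumptions made for this sketch; it is not part of any NDA tool, and it does not check the Collection phase or the Data Expected list.

```python
from datetime import date

# Deadlines as (month, day) per Data Repository, as stated in the webinar.
# The dictionary keys are labels chosen for this sketch.
DEADLINES = {
    "NIMH": [(1, 15), (7, 15)],
    "NIAAA": [(4, 1), (10, 1)],
}


def next_deadline(repository: str, today: date) -> date:
    """Return the first submission deadline on or after `today`."""
    candidates = [
        date(year, month, day)
        for year in (today.year, today.year + 1)
        for month, day in DEADLINES[repository]
    ]
    return min(d for d in candidates if d >= today)


print(next_deadline("NIMH", date(2024, 3, 1)))    # 2024-07-15
print(next_deadline("NIAAA", date(2024, 10, 2)))  # 2025-04-01
```

Looking one year ahead covers the case where both of this year's deadlines have already passed.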
Before you can start submitting your data, there are four items you need to complete and have.
You need the NDA Data Submission Agreement or the DSA, the Data Expected list, permissions to submit, and Global Unique Identifiers or GUIDs.
So first, the Data Submission Agreement must be submitted through the eDSA tool, which is the electronic Data Submission Agreement tool.
The PI of your NDA Collection should follow the instructions they received in the Welcome to Data Sharing e-mail when their NDA Collection was first created.
Once the PI completes the DSA, they'll be given access to the NDA Collection and the Data Expected list.
The Data Expected list is the list of all Data Structures that you'll submit data for.
Access to edit the Data Expected list is provided to those with admin permissions to the NDA Collection.
Your PI can provide you these permissions via the Permissions tab of the NDA Collection, which brings us to our next point, which is the permissions needed to submit.
You must have access to an NDA Collection and at least have submission permissions to submit data to the NDA Collection.
You can see if you have access to the NDA Collection through the Collections or compliance dashboard, and your PI can provide you these permissions via the Permissions tab of the NDA Collection.
As you can see in the bottom picture, towards the right, there are three different levels of permissions that you can have: admin, submission, or query.
Again, you must have submission permissions to submit.
Our last prerequisite is Global Unique Identifiers, or GUIDs. These are required for all data submissions.
Access to create GUIDs through the NDA GUID tool is requested through the NDA Help Desk.
In order for the NDA Help Desk to provide you permissions, you must have submission permissions to an active NDA Collection.
A little bit more about GUIDs: these are alphanumeric codes used as an identifier for every research participant that you submit data for.
In order to create a GUID, you must have the first, middle, and last name of the participant, the sex at birth, date of birth, and city of birth.
You should use the information from the participant's birth certificate to ensure that all the information is correct when creating your GUID.
The NDA Data Sharing Terms and Conditions include the expectation that you will collect the data necessary to create real GUIDs. However, if you do not have the information to create a real GUID, or you're unable to obtain that information from the participant, pseudoGUIDs can be requested in place of GUIDs.
In order to request pseudoGUIDs, please have the PI provide the NDA Help Desk a detailed justification for the use of pseudoGUIDs, the number of pseudoGUIDs needed, and whether pseudoGUIDs will be needed for more than one Submission Cycle.
If at a later time you obtain the information to create a real GUID, but you've already used the pseudoGUID, you can use the NDA GUID Tool to promote your pseudoGUID to a real GUID, linking the pseudoGUID you previously submitted to the newly created GUID.
Just to recap, the four items you need before you can start submitting data are the NDA Data Submission Agreement, the Data Expected list, the permissions to submit, and Global Unique Identifiers.
Once you have the DSA and the Data Expected list completed and you have the permissions to submit and you've created GUIDs, you should submit all the data you've collected for your grant.
Do note that you do not need to submit for all Data Structures every Submission Cycle.
You only need to submit for the Data Structures that are ready for submission and that you do have collected data for.
When submitting your data, specifically clinical and phenotypic data, you should submit these data cumulatively.
By this I mean, let's say your first submission includes 5 subjects at timepoint A.
After your first submission, let's say you collected 10 new subjects at timepoint A, along with additional data from those original 5 subjects at a different timepoint, timepoint B.
Then, in your second submission, you should include the original data from the five subjects at timepoint A, the data from those same five subjects at timepoint B, and the 10 new subjects at timepoint A, totaling 15 subjects in your second submission.
But if you're submitting imaging, EEG, or omics data, these are non-cumulative submissions, meaning you only need to submit these data once.
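The cumulative example above can be checked with a short Python sketch. The subject IDs are made up for illustration, and each record is simply a (subject, timepoint) pair rather than a real Submission Template row.

```python
# Each record is (subject, timepoint); the IDs are made up for illustration.
first_submission = {(f"S{i:02d}", "A") for i in range(1, 6)}

# Collected after the first submission:
new_followups = {(f"S{i:02d}", "B") for i in range(1, 6)}   # original 5 at timepoint B
new_subjects = {(f"S{i:02d}", "A") for i in range(6, 16)}   # 10 new subjects at timepoint A

# Cumulative: the second submission repeats everything already submitted.
second_submission = first_submission | new_followups | new_subjects

subjects = {subject for subject, _ in second_submission}
print(len(subjects), len(second_submission))  # 15 20
```

So the second submission covers 15 subjects across 20 rows, because the five original subjects each appear at two timepoints.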
If you see that some of your Structures are not yet ready for submission, you should still submit data for the Structures that are ready for submission and that are approved.
To check if your Structure is ready for submission, visit your Data Expected tab and, in your Data Expected list, you should see an information “i” icon next to your Data Structure's name.
Click that “i” icon and you'll see one of two messages.
You'll either see “Structure not yet defined”, which means that your Structure is not ready for submission and our data curation team is still working on your Data Structure.
But if you see a table with your Data Structure title and short name, this means your Structure is ready for submission.
If you see that your Structure is ready for submission, your next step is to download the Submission Template for that Data Structure.
Your Submission Template is what you use to enter your data and to submit your data into NDA.
To download your Submission Template, click on the short name hyperlink after clicking the information “i” icon.
This will take you to your Data Structure page.
On your Data Structure page, you'll see a download section towards the right. Click on “Submission Template”.
This will download a CSV file which will look something like this. You'll see that the first two rows are prepopulated.
The first row provides the Structure short name and the version number, and the second row provides all the Data Elements for that Data Structure.
When completing your Submission Template, do not change anything about the first 2 rows.
You can certainly hide columns or Data Elements that you don't use, but you should not delete or reorder any of the columns.
When filling out your Submission Template, each row will equal one subject at one timepoint. So if you have one subject at three different timepoints, you would have three rows for that single subject.
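To make the template layout concrete, here is a minimal Python sketch under stated assumptions. The structure name `example_structure`, the `score` column, the underscore spellings of the Data Elements, and every data value are hypothetical placeholders, and the check is a local convenience rather than anything the NDA tools provide.

```python
import csv
import io
from collections import Counter

# Hypothetical template: row 1 = short name and version, row 2 = Data Elements.
TEMPLATE = (
    "example_structure,1\n"
    "subjectkey,src_subject_id,interview_date,interview_age,sex,score\n"
)

# One subject at three timepoints -> three rows (all values are placeholders).
FILLED = TEMPLATE + (
    "GUID_A,P001,01/10/2023,120,F,4\n"
    "GUID_A,P001,07/10/2023,126,F,6\n"
    "GUID_A,P001,01/10/2024,132,F,7\n"
)


def read_rows(text):
    """Parse CSV text into a list of rows."""
    return list(csv.reader(io.StringIO(text)))


def headers_unchanged(template_text, filled_text):
    """True if the first two rows match the template (same columns, same order)."""
    return read_rows(template_text)[:2] == read_rows(filled_text)[:2]


def rows_per_subject(filled_text):
    """Count data rows per subjectkey, skipping the two prepopulated rows."""
    rows = read_rows(filled_text)
    key_index = rows[1].index("subjectkey")
    return Counter(row[key_index] for row in rows[2:])


print(headers_unchanged(TEMPLATE, FILLED))  # True
print(rows_per_subject(FILLED)["GUID_A"])   # 3
```

Comparing the first two rows against the downloaded template catches deleted or reordered columns before you upload.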
When completing your Submission Template, we advise having the Definitions download or the Data Structure page open so you can see the different data types, the size limits, the values that are acceptable for that Data Element, and if the Data Element is required.
Required Data Elements in every Data Structure must have a value.
Conditional Data Elements should have a value if applicable to the condition.
Any Recommended and optional Data Elements can be left blank if not applicable to your study.
If you did not collect a required Data Element, we advise reviewing the Value Range and Notes column of the Definitions file or the Data Structure page to see if there are NA or null values that you can enter.
So here, we can see that for the sex Data Element you can provide an NR value, meaning “not reported”, if you did not collect that data.
Every single Data Structure in NDA will have these five required Data Elements that you must provide: subjectkey, src subject ID, interview date, interview age and sex.
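Before uploading, you could run a quick local check that none of these five elements is blank. The sketch below is only an illustration: the underscore spellings are assumed from the spoken names, the sample file is hypothetical, and it does not replace the Validation and Upload Tool or check data types and value ranges.

```python
import csv
import io

# Element spellings are assumed from the names spoken in the webinar.
REQUIRED = ["subjectkey", "src_subject_id", "interview_date", "interview_age", "sex"]

# Hypothetical filled template; every value is a placeholder.
SAMPLE = (
    "example_structure,1\n"
    "subjectkey,src_subject_id,interview_date,interview_age,sex,score\n"
    "GUID_A,P001,01/10/2023,120,NR,4\n"
    "GUID_B,P002,,118,F,\n"
)


def missing_required(csv_text):
    """Return (data_row_number, element) pairs where a required element is blank."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    elements = rows[1]  # second row lists the Data Elements
    problems = []
    for number, row in enumerate(rows[2:], start=1):
        record = dict(zip(elements, row))
        for name in REQUIRED:
            if not record.get(name, "").strip():
                problems.append((number, name))
    return problems


print(missing_required(SAMPLE))  # [(2, 'interview_date')]
```

The first row passes because NR counts as a value for sex. The second row is flagged only for its blank interview date, since the blank `score` column is not one of the five required elements.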
Once you have your Submission Template filled out with all the Required Data Elements, along with any Recommended Data Elements that are applicable to your study, you can submit your data using the Validation and Upload Tool.
In order to submit your data using this tool, you must have submission permissions to your NDA Collection.
There are two different versions of the tool that you can use. The first version is the HTML Validation and Upload Tool; this is our web version. This is best used for cumulative data like your clinical and phenotypic data. We do have a tutorial on our website that walks through using this tool step by step.
The second version is the Python command-line version of the tool. This is best used for larger files like your non-cumulative imaging files, and any files over 2.5 gigabytes.
To submit your data, you must use the Submission Templates from the NDA Data Dictionary, like I just showed you, or the ones provided to you directly by your Data Curator. Otherwise, you will come across errors, and you won't be able to submit your data.
A little bit more about the tool. The tool will validate your data against the NDA Data Dictionary to ensure that your data is harmonized correctly with your Data Expected list.
We highly recommend using it anytime you collect new data because the first step in the tool is to validate your data against your Data Expected list.
When you're in the tool and once you've uploaded your Submission Templates, you may come across warnings or errors. You must resolve any errors that are found in your data, but you can still submit if just warnings are present.
If you do come across any errors or messages that you're unsure of how to resolve, please use the request help feature which will create a Help Desk ticket for you.
If you see that none of your Structures are ready, as in you see “Structure not yet defined” for the Structures that you do have collected data for, don't worry, we do have a workaround for you.
But please note that there will always be at least one ready-to-submit Data Structure in your NDA Collection: your Mandatory Data Structures.
Your Mandatory Data Structures are prepopulated by NDA when your NDA Collection is created.
All NDA Collections will have the Research Subject and Pedigree Data Structure.
If you're submitting omics data, please submit for the genomic subject 02 Data Structure.
If you're not submitting omics data, please use the NDAR subject 01.
In your NDA Collection, you may also see these NIMH CDE, or Common Data Element Data Structures. These are added to NDA Collections where the grant is subject to the guide notice NOT-MH-20-067.
If you have any questions regarding the requirement to submit for these Data Structures, please e-mail your program officer.
For all Mandatory Data Structures and all the Data Structures listed in your Data Expected list, please update the Targeted Enrollment number, which is the total number of subjects you expect to submit for that Data Structure.
Then submit your data cumulatively.
If, during a Submission Cycle, you either don't have any data to submit or the Structures in your Data Expected list aren't ready, you can submit a Submission Exemption, which notifies NDA that you will not be submitting any data for that Submission Cycle.
There are three types of Exemptions that you can submit.
Type A: No New Data This Period. This means that you don't have any new data to submit.
Type B: Deadline Extension. This typically means that you do have data to submit, but it will likely be ready to submit after the submission deadline.
And lastly, we have No New Data- Change Phase to Data Analysis. This Exemption is typically submitted when you've completed data submission for the entire grant, and you do not expect to submit any more data to your NDA Collection.
And on our last slide: if you've previously received a QA Results Report notification from NDA, you should first correct your QA errors and resubmit your corrected data before submitting any new data during a Submission Cycle.
You should not submit new data in your resubmissions.
Please follow the instructions that you received in your QA results notification to correct your data.
And that is the end of our presentation. If you have any further questions or would like a copy of the slide deck, please e-mail the NDA Help Desk at NDAHelp@mail.nih.gov.
All the information from this presentation is located in our Submission Cycle FAQ page, which is linked on the screen. Thank you for joining!