How to Request Datasets from dbGaP

The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amount of genotypic data required to make these analyses possible.

dbGaP provides two levels of access, open and controlled, to allow broad release of non-sensitive data while providing oversight and investigator accountability for sensitive datasets involving personal health information.

See the NIH Genomic Data Sharing (GDS) Policy, which sets forth NIH’s expectations for the broad and responsible sharing of large-scale human and non-human genomic data. In addition, please refer to the NIH notice Implementation Update for Data Management and Access Practices Under the Genomic Data Sharing Policy, which includes additional IT security requirements.

Requesting datasets from dbGaP includes steps within the eRA Commons as well as within SAGE.

Steps to Request Datasets from dbGaP

  1. Are you qualified and ready to do so?
    • Must be an employee of the UW.
    • Appropriate system credentials in eRA Commons. If you need access, see Commons Roles at the UW.
    • When requesting access to controlled-access data, you must have:
      • Requisite IT systems or a license to UW’s third-party computing infrastructure compliant with NIST SP 800-171.
        • See information from UW-IT on
      • Completed the required training titled:
        • Save completion certificate to include with SAGE Request.
      • Authorized IT Director confirmation that the IT environment meets NIH Security Best Practices for Users of Controlled-Access Data.
      • Assurance signed by Approved User that the NIH Security Best Practices can be met.
  2. Review
  3. If you will request controlled-access data, reach out to an authorized IT Director for consultation prior to completing your Data Access Request (DAR).
    • Who is an authorized IT Director? For UW-hosted systems, please see . For an NIH-hosted IT environment, please reach out to your department’s IT Manager or IT Administrator.
  4. Start your Data Access Request (DAR).
    • Choose datasets you wish to access.
      • Some datasets require IRB approval. See the Human Subjects Division guidance on obtaining IRB approval.
    • Select the Signing Official: Select the authorized official. Your OSP reviewer will update the Signing Official to themselves after they receive the accompanying SAGE request. See steps to Prepare your Request in SAGE to OSP.
    • In the DAR, list the authorized IT Director who has firsthand knowledge of the IT environment you intend to use. This is the same person who signs the IT Director Confirmation.
    • If using a Cloud Computing IT Environment (UW Government Community Cloud or UW GCC), upload the UW Cloud Computing IT Environment Statement into the DAR.
    • Read the attestation language.
    • Add other necessary attachments required by NIH, such as IRB Approval.
    • Read and agree to the terms and conditions as the “Approved User”:
      • Investigators and their institutions are responsible for safeguarding the accessed datasets. Pay close attention to the Data Use Certification (DUC) being made by you as an Approved User.
  5. Review and approve the Data Access Request so it begins routing to the Signing Official.
  6. Download a copy of the DAR, then proceed with next steps to prepare your SAGE request to OSP.

Prepare your SAGE Request to OSP

There are two scenarios:

  • Is the DAR associated with an existing sponsored program? Route an OSP & GCA Modification Request (MOD) in SAGE.
    • Select “Federal Data Repository Access & Submission” in the Other Changes section of the MOD so it is routed to the correct reviewer.
  • Is the DAR not associated with a specific sponsored program? Route a Non-award Agreement (NAA) eGC1.
  1. Gather these items and attach to your Award Modification or eGC1:
    • Copy of the DAR.
    • A copy of a signed IT Director confirmation. This is the same person who is named as IT Director in your DAR.
    • If the dataset you wish to access requires IRB approval, a copy of the IRB approval.
    • Copy of the completion certificate for the required training titled:
  2. OSP will review the Award Modification Request (MOD) or NAA eGC1 together with the DAR in eRA Commons.
  3. Check status on the “My Requests” page in eRA Commons.
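
The two scenarios and the attachment list above can be sketched as a small decision helper. This is only an illustration, not part of SAGE or eRA Commons: the class name DarContext, its fields, and both function names are hypothetical, and the strings simply restate the items listed above.

```python
from dataclasses import dataclass


@dataclass
class DarContext:
    """Hypothetical summary of a DAR; field names are illustrative only."""
    has_sponsored_program: bool  # DAR tied to an existing sponsored program?
    requires_irb: bool           # does the chosen dataset require IRB approval?


def sage_item(ctx: DarContext) -> str:
    """Pick the SAGE item to route, following the two scenarios above."""
    if ctx.has_sponsored_program:
        return "OSP & GCA Modification Request (MOD)"
    return "Non-award Agreement (NAA) eGC1"


def attachments(ctx: DarContext) -> list[str]:
    """List the items to attach to the MOD or eGC1."""
    items = [
        "Copy of the DAR",
        "Signed IT Director confirmation",
        "Training completion certificate",
    ]
    # IRB approval is attached only when the dataset requires it.
    if ctx.requires_irb:
        items.append("IRB approval")
    return items
```

For example, a DAR with no sponsored program and a dataset that requires IRB approval would route as an NAA eGC1 with four attachments.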

Signing Official (OSP) Review

  • DAR is complete.
  • An authorized IT Director is identified.
  • A signed confirmation statement from the IT Director is attached in SAGE to the NAA eGC1 or Award Modification.
  • Assurance statement signed by the Approved User is attached to the SAGE item.
  • If the IT Environment used is “GCC High”, the PI has uploaded the UW Cloud Computing Statement in the DAR.
  • IRB approval, if needed, is attached to the DAR, and corresponds to the study in question.
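
Before routing, a requester could self-check a packet against the review items above. The sketch below is an assumption-laden illustration, not an OSP tool: the function name review_gaps and every dictionary key are hypothetical, and the logic only mirrors the list, including its two conditional items (the cloud statement for GCC High, and IRB approval when needed).

```python
def review_gaps(packet: dict) -> list[str]:
    """Return the review items not yet satisfied (empty list means ready)."""
    gaps = []
    if not packet.get("dar_complete"):
        gaps.append("DAR is incomplete")
    if not packet.get("it_director_named"):
        gaps.append("No authorized IT Director identified")
    if not packet.get("it_confirmation_in_sage"):
        gaps.append("IT Director confirmation missing from SAGE")
    if not packet.get("user_assurance_in_sage"):
        gaps.append("Approved User assurance missing from SAGE")
    # The cloud statement is only checked for the GCC High environment.
    if packet.get("it_environment") == "GCC High" and not packet.get("cloud_statement_in_dar"):
        gaps.append("Cloud Computing Statement missing from DAR")
    # IRB approval is only checked when the dataset requires it.
    if packet.get("irb_required") and not packet.get("irb_in_dar"):
        gaps.append("IRB approval missing from DAR")
    return gaps
```

An empty result means every item on the list is covered; otherwise each string names one gap to fix before OSP review.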

Change Notes

7/07/2025 – Added sub-bullet for further clarification of SAGE steps: Select “Federal Data Repository Access & Submission” in the Other Changes section of the MOD.

3/20/2025 – Removed requirement for PIs to provide an SSP and references to the SSP. The SSP is on file with the UW IT Director for each UW system listed on the UW-IT website. Added information on who to reach out to for an NIH-hosted environment.