Autism Data Science Initiative Data Resources

Navigation #navigation

Example Table Information #extable

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | The proper name for the repository and/or acronym spell-out |
| Repository Short Name | Acronym or short name for the repository, if one exists |
| Brief Description | Description of the repository and the purpose it serves |
| Web Address | Homepage URL for the repository |
| Help Email | Email link for the general public to contact repository staff |
| Affiliation | The organization that hosts and maintains the database and associated software |
| Organism | The types of organisms from which data are shared in the repository |
| Research Areas | The research domain(s) for which the repository shares data |
| Data Types | Keywords for types of data associated with the repository. Data types are sourced directly from the repository website without further interpretation; different repositories may use different terms to describe the same type of data. |
| Controlled Access | Whether the repository has a controlled access option for access to datasets: "Yes", the repository includes a controlled access option; "No", the repository does not include a controlled access option; "Unclear", it is unclear whether the repository includes a controlled access option, or the repository website does not specify this information. |
| Data Access Control Description | List of the options the repository offers for access to hosted datasets: "Open access", no access restrictions or registration required; "Registration required", open to all, but users need to be signed in or registered with the resource; "Controlled access", requires verification of the requestor's identity and of the appropriateness of their proposed research use through a review process or committee before protected data can be accessed; "Enclave", controlled access where data cannot be downloaded or removed from a specific environment. |
| Data Access Control Links | URL to information about data access controls |
| Fairsharing Link | URL to the fairsharing.org listing where one exists. |
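
For readers who want to work with these tables programmatically, the two controlled vocabularies defined above ("Controlled Access" and "Data Access Control Description") can be modeled as enumerations. The following is a minimal sketch, not an official schema: the class names, the `parse_access_options` helper, and the `RepositoryRecord` container are hypothetical, and the sketch assumes multiple options in a cell are separated by semicolons. It covers only a subset of the fields.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ControlledAccess(Enum):
    """Allowed values for the "Controlled Access" field."""
    YES = "Yes"
    NO = "No"
    UNCLEAR = "Unclear"


class AccessOption(Enum):
    """Allowed values for the "Data Access Control Description" field."""
    OPEN_ACCESS = "Open access"
    REGISTRATION_REQUIRED = "Registration required"
    CONTROLLED_ACCESS = "Controlled access"
    ENCLAVE = "Enclave"


def parse_access_options(text: str) -> List[AccessOption]:
    """Split a semicolon-separated cell and map each term to an AccessOption.

    Matching is case-insensitive. A term outside the vocabulary raises
    ValueError so that nonstandard labels are surfaced rather than guessed.
    """
    by_label = {option.value.lower(): option for option in AccessOption}
    options = []
    for term in text.split(";"):
        label = term.strip().lower()
        if not label:
            continue  # ignore empty segments such as a trailing semicolon
        if label not in by_label:
            raise ValueError(f"Unknown access option: {term.strip()!r}")
        options.append(by_label[label])
    return options


@dataclass
class RepositoryRecord:
    """Hypothetical container for a subset of the fields defined above."""
    full_name: str
    short_name: str
    web_address: str
    controlled_access: ControlledAccess
    access_options: List[AccessOption] = field(default_factory=list)


# Example built from the PEDSnet entry (Table 16).
record = RepositoryRecord(
    full_name="PEDSnet: A pediatric learning health system",
    short_name="PEDSnet",
    web_address="https://pedsnet.org/database/",
    controlled_access=ControlledAccess("Yes"),
    access_options=parse_access_options("Controlled Access"),
)
print([option.value for option in record.access_options])
# → ['Controlled access']
```

Raising an error on unrecognized terms is a deliberate choice in this sketch: some entries use labels that differ from the defined vocabulary, and those are better reviewed by hand than mapped silently.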

Return to Navigation 

Table 1: National Institute of Mental Health Data Archive  #table1

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | The National Institute of Mental Health Data Archive |
| Repository Short Name | NIMH NDA |
| Brief Description | The NIMH NDA is a large-scale repository that houses a wide variety of data related to autism, including behavioral, clinical, genetic, and neuroimaging data. Researchers can access de-identified data with appropriate approvals. The NDA encompasses data from the National Database for Autism Research (NDAR), the National Database for Clinical Trials related to Mental Illness (NDCT), the Research Domain Criteria Database (RDoCdb), the NIH Pediatric MRI Data Repository, the Adolescent Brain Cognitive Development (ABCD) Study, the Connectome Coordination Facility (CCF), the Osteoarthritis Initiative (OAI), the National Institute on Alcohol Abuse and Alcoholism Data Archive, the Helping to End Addiction Long-term Initiative (NIH HEAL), the NeuroBioBank Data Repository, and the PsychENCODE Consortium. |
| Web Address | https://nda.nih.gov/ |
| Help Email | [email protected] |
| Affiliation | National Institute of Mental Health (NIMH) |
| Organism | Human subjects |
| Research Areas | Clinical studies, Medicine, Autism, Mental Illness, Cognitive Development, Neurology, Osteoarthritis, Alcohol Abuse and Alcoholism, Triplet Repeat Disease |
| Data Types | phenotypic data, imaging and other neurosignal recordings data, and genomic/pedigree data related to mental health on human subjects |
| Controlled Access | Yes |
| Data Access Control Description | Open access; Registration required; Controlled access |
| Data Access Control Links | https://nda.nih.gov/nda/access-data-info |
| Fairsharing Link | https://fairsharing.org/3209 |

Return to Navigation 

Table 2: NICHD Data and Specimen Hub #table2

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | NICHD Data and Specimen Hub |
| Repository Short Name | NICHD DASH |
| Brief Description | NICHD Data and Specimen Hub (DASH) allows researchers to share and access de-identified data from studies funded by NICHD and also serves as a portal for requesting biospecimens from selected DASH studies. DASH hosts de-identified datasets from clinical and population health studies funded by NICHD and relevant to the NICHD mission, including the National Children's Study and the Environmental influences on Child Health Outcomes (ECHO)-wide Cohort study. |
| Web Address | https://dash.nichd.nih.gov/ |
| Help Email | [email protected] |
| Affiliation | Eunice Kennedy Shriver National Institute of Child Health and Human Development |
| Organism | Human subjects |
| Research Areas | Life science, Critical Care Medicine, Pediatrics, Biomedical Science, Clinical Studies, Demographics, Gynecology, Obstetrics, Pharmacology, Social Science, Medicine, Musculoskeletal Medicine, Reproductive Health, Behavior, Sleep, Safety |
| Data Types | Research data and biospecimens |
| Controlled Access | Yes |
| Data Access Control Description | Controlled access |
| Data Access Control Links | https://dash.nichd.nih.gov/resource/policies; https://dash.nichd.nih.gov/resource/request |
| Fairsharing Link | https://fairsharing.org/FAIRsharing.dYSI4O |

Return to Navigation 

Table 3: Database of Genotypes and Phenotypes #table3

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | Database of Genotypes and Phenotypes |
| Repository Short Name | dbGaP |
| Brief Description | dbGaP archives and distributes the data and results from studies that have investigated the interaction of genotype and phenotype in humans. It includes genomic data from the NIH-funded Autism Sequencing Consortium and additional relevant studies. |
| Web Address | https://www.ncbi.nlm.nih.gov/gap/ |
| Help Email | [email protected] |
| Affiliation | National Center for Biotechnology Information, National Library of Medicine |
| Organism | Human subjects |
| Research Areas | Biomedical Science, Genetics, Epigenetics, Expression Data, Genetic Polymorphism, Phenotype, Genotype |
| Data Types | phenotype data, association (GWAS) data, summary level analysis data, SRA (Short Read Archive) data, reference alignment (BAM) data, VCF (Variant Call Format) data, expression data, imputed genotype data, image data, etc. |
| Controlled Access | Yes |
| Data Access Control Description | Controlled access |
| Data Access Control Links | https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login; https://www.ncbi.nlm.nih.gov/books/NBK570247/ |
| Fairsharing Link | https://fairsharing.org/FAIRsharing.88v2k0 |

Return to Navigation 

Table 4: National Metabolomics Data Repository  #table4

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | National Metabolomics Data Repository |
| Repository Short Name | NMDR, Metabolomics Workbench |
| Brief Description | Repository for metabolomics data and a resource for analytic tools and protocols. |
| Web Address | https://www.metabolomicsworkbench.org/ |
| Help Email | [email protected] |
| Affiliation | UC San Diego, National Institutes of Health Common Fund |
| Organism | Human subjects |
| Research Areas | Metabolomics for small and large studies on cells, tissues and organisms |
| Data Types | Processed data (measurements) may be in the form of quantitated metabolite concentrations, MS peak height/area values, LC retention times, NMR binned areas, etc. Raw data in the form of MS and NMR binary files and associated parameter files may also be uploaded. |
| Controlled Access | No |
| Data Access Control Description | Open-access enclave |
| Data Access Control Links | N/A |
| Fairsharing Link | https://fairsharing.org/FAIRsharing.xfrgsf |

Return to Navigation 

Table 5: Human Health Exposure Analysis Resource Data Center #table5

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | Human Health Exposure Analysis Resource Data Center |
| Repository Short Name | HHEAR Data Center |
| Brief Description | A large, de-identified data repository of epidemiologic and environmental exposure biomarker data including studies with relevant autism and neurodevelopmental outcomes. |
| Web Address | https://hheardatacenter.mssm.edu/ |
| Help Email | [email protected] |
| Affiliation | Icahn School of Medicine at Mount Sinai; National Institute of Environmental Health Sciences |
| Organism | Human subjects |
| Research Areas | Clinical Studies, Public Health, Epidemiology, Exposure, Environmental Health |
| Data Types | Biomarker measurements |
| Controlled Access | Yes |
| Data Access Control Description | Registration required |
| Data Access Control Links | |
| Fairsharing Link | https://fairsharing.org/FAIRsharing.88v2k0 |

Return to Navigation 

Table 6: NHGRI Analysis Visualization and Informatics Lab-space  #table6

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | NHGRI Analysis Visualization and Informatics Lab-space |
| Repository Short Name | AnVIL |
| Brief Description | A unified computing environment for the storage, management, and analysis of genomics and related data. It enables population-scale analysis and facilitates collaboration through the sharing of data, code, and analysis results. The core data management and analysis components of AnVIL currently consist of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter. |
| Web Address | https://anvilproject.org/ |
| Help Email | [email protected] |
| Affiliation | National Human Genome Research Institute |
| Organism | Human subjects |
| Research Areas | Genomics |
| Data Types | Biomarker measurements |
| Controlled Access | Yes |
| Data Access Control Description | Registration required |
| Data Access Control Links | https://anvilproject.org/faq/data-security |
| Fairsharing Link | https://fairsharing.org/FAIRsharing.88v2k0 |

Return to Navigation 

Table 7: National Longitudinal Transition Study-2 #table7

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | National Longitudinal Transition Study-2 |
| Repository Short Name | NLTS2 |
| Brief Description | The National Longitudinal Transition Study-2 was commissioned by the US Department of Education and documented the experiences of students aged 13-16 as they moved from secondary school into adult roles. The NLTS2 includes data on the secondary school experiences of youth in special education, including their schools, school programs, related services, and extracurricular activities, and measures outcomes in the education, employment, social, and residential domains, including factors that contribute to more positive outcomes. |
| Web Address | https://nlts2.sri.com/ |
| Help Email | [email protected] |
| Affiliation | Department of Education |
| Organism | Human subjects |
| Research Areas | Special Education |
| Data Types | Parent/youth interview, school survey, student assessment data, demographic data, household characteristics |
| Controlled Access | No; Yes |
| Data Access Control Description | Open Access; Controlled Access |
| Data Access Control Links | https://nces.ed.gov/statprog/rudman/ |
| Fairsharing Link | N/A |

Return to Navigation 

Table 8: National Survey of Children’s Health  #table8

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | National Survey of Children’s Health |
| Repository Short Name | NSCH |
| Brief Description | The NSCH is funded by the Health Resources & Services Administration (HRSA) and supports national efforts to improve the health and development of children. National and state level data are released annually and focus on key measures of child health and well-being to understand the health status and health services needs of children across the nation. Data from the Children with Special Health Care Needs (CSHCN) are also included and explore the extent to which children with special health care needs have medical homes, adequate health insurance, access to needed services, and adequate care coordination. Other topics include functional difficulties, transition services, shared decision-making, and satisfaction with care. |
| Web Address | https://mchb.hrsa.gov/data-research/national-survey-childrens-health |
| Help Email | [email protected] |
| Affiliation | Health Resources & Services Administration (HRSA) |
| Organism | Human subjects |
| Research Areas | Physical and emotional health of children, access to and use of health care, family interactions, parental health, school and after-school experiences, neighborhood characteristics |
| Data Types | |
| Controlled Access | No |
| Data Access Control Description | Open; Registration Required |
| Data Access Control Links | https://www.census.gov/programs-surveys/nsch/data/datasets.html; https://www.childhealthdata.org/dataset |
| Fairsharing Link | N/A |

Return to Navigation 

Table 9: Medical Expenditure Panel Survey #table9

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | Medical Expenditure Panel Survey |
| Repository Short Name | MEPS |
| Brief Description | Funded by the Agency for Healthcare Research and Quality, the MEPS is a set of large-scale surveys of families and individuals, their medical providers, and employers across the United States. MEPS is the most complete source of data on the cost and use of health care and health insurance coverage. The Household Component provides data from individual households and their members, which is supplemented by data from their medical providers. The Insurance Component is a separate survey of employers that provides data on employer-based health insurance. |
| Web Address | https://meps.ahrq.gov/mepsweb/ |
| Help Email | [email protected] |
| Affiliation | Agency for Healthcare Research and Quality |
| Organism | Human subjects |
| Research Areas | Access to health care, Children’s Health, Men’s Health, Women’s Health, Elderly Health, Insurance, Disability, Minority Health, Health Care Disparities, Home Health Care, Employment, Injuries, Mental Health, Obesity, Opioids, Pharmacy & Prescription Drugs, Preventative Care, Arthritis, Asthma, Cancer, Diabetes, Emphysema and Bronchitis, Heart Conditions, High Blood Pressure, High Cholesterol, Strokes, Quality of Health Care, Veteran’s Health, Vision Impairment, Health expenditures |
| Data Types | |
| Controlled Access | No; Yes |
| Data Access Control Description | Open access; Enclave |
| Data Access Control Links | https://meps.ahrq.gov/mepsweb/data_stats/onsite_datacenter.jsp |
| Fairsharing Link | N/A |

Return to Navigation 

Table 10: Medicaid and the Children’s Health Insurance Program #table10

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | Medicaid and the Children’s Health Insurance Program Open Data |
| Repository Short Name | Medicaid & CHIP Open Data |
| Brief Description | Data.Medicaid.gov is a public platform offering open access to a diverse range of datasets related to Medicaid and the Children’s Health Insurance Program (CHIP). It is tailored to support policymakers, researchers, and the general public by providing critical data for research, reporting, and analysis. The platform covers various topics, including state Medicaid and CHIP programs, enrollment statistics, spending trends, and quality metrics. |
| Web Address | https://data.medicaid.gov |
| Help Email | [email protected] |
| Affiliation | U.S. Centers for Medicare & Medicaid Services |
| Organism | Human subjects |
| Research Areas | Drug utilization, drug pricing and payment, enrollment, reimbursements, behavioral health care, demographics, maternal health, mental health, disability, dental health, telehealth, substance use disorder, managed care |
| Data Types | |
| Controlled Access | No |
| Data Access Control Description | Open |
| Data Access Control Links | N/A |
| Fairsharing Link | N/A |

Return to Navigation 

Table 11: Kaiser Permanente Research Bank #table11

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | Kaiser Permanente Research Bank |
| Repository Short Name | KP Research Bank |
| Brief Description | The KP Research Bank includes robust data and specimen collection from members of a real-world health system, including genomic data resources. The retrospective, longitudinal medical records available include over 440K participants recruited through multiple outreach efforts since 2008, and extend more than 20 years for the majority of the cohort. Researchers can apply to use this resource tailored to their specific study design. |
| Web Address | https://researchbank.kaiserpermanente.org/for-researchers/ |
| Help Email | https://researchbank-econsent.kaiserpermanente.org/ContactUs/Index?ref=noreferrer&lang=en |
| Affiliation | Kaiser Permanente |
| Organism | Human subjects |
| Research Areas | General health, cancer, pregnancy |
| Data Types | biospecimens, genomic data, self-reported health survey data, and KP clinical data |
| Controlled Access | Yes |
| Data Access Control Description | Controlled access |
| Data Access Control Links | https://researchbank.kaiserpermanente.org/for-researchers/apply-for-access/ |
| Fairsharing Link | N/A |

Return to Navigation 

Table 12: Autism Speaks MSSNG Database #table12

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | Autism Speaks MSSNG Database |
| Repository Short Name | MSSNG |
| Brief Description | The MSSNG project aims to create a whole genome sequencing database on autism with deep phenotyping, with a focus on identifying subtypes of autism to inform diagnostics and personalized treatments. Data are available upon request. |
| Web Address | https://research.mss.ng/ |
| Help Email | [email protected] |
| Affiliation | Autism Speaks, Verily, DNAstack, Hospital for Sick Children (SickKids) |
| Organism | Human subjects |
| Research Areas | Autism |
| Data Types | Genomic data, phenotypic data |
| Controlled Access | Yes |
| Data Access Control Description | Controlled access |
| Data Access Control Links | https://autismspeaks.fluxx.io/; https://research.mss.ng/assets/documents/db7/genomics-application-process_2.5.2025.docx |
| Fairsharing Link | N/A |

Return to Navigation 

Table 13: Simons Foundation Autism Research Initiative (SFARI) Base #table13

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | Simons Foundation Autism Research Initiative Base |
| Repository Short Name | SFARI Base |
| Brief Description | SFARI Base is a clearinghouse for autism and autism-related research data and biospecimens supported by the Simons Foundation Autism Research Initiative (SFARI). It includes the Simons Simplex Collection, a permanent repository of genetic samples from 2,700 simplex families; Simons Foundation Powering Autism Research (SPARK), a collection of medical and behavioral information for over 100,000 people with autism; and the Autism Inpatient Collection (AIC), which includes phenotypic and genetic data from 1,555 children with a clinical diagnosis of autism who have been admitted to one of six specialized inpatient child psychiatry units in the United States. Researchers can request access to phenotypic, genetic, or imaging data and order biospecimens. |
| Web Address | https://www.sfari.org/resource/sfari-base/ |
| Help Email | [email protected] (application process); [email protected] (Autism BrainNet tissue request process) |
| Affiliation | Simons Foundation Autism Research Initiative |
| Organism | Human subjects |
| Research Areas | Autism |
| Data Types | Research data and biospecimens |
| Controlled Access | Yes |
| Data Access Control Description | Controlled Access |
| Data Access Control Links | https://base.sfari.org/ |
| Fairsharing Link | N/A |

Return to Navigation 

Table 14: National COVID Cohort Collaborative (N3C) Data Enclave #table14

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | National COVID Cohort Collaborative |
| Repository Short Name | N3C |
| Brief Description | The N3C Data Enclave is a secure platform through which harmonized clinical data provided by our contributing members are stored. The Enclave includes demographic and clinical characteristics of patients who have been tested for or diagnosed with COVID-19, and further information about the strategies and outcomes of treatments for those suspected or confirmed to have the virus. Additional data from publicly available datasets, claims data, and mortality data are also available to support studies. For more information on the inclusion and exclusion criteria, see the N3C Phenotype. |
| Web Address | https://covid.cd2h.org/ |
| Help Email | https://covid.cd2h.org/support/ |
| Affiliation | National Center for Advancing Translational Sciences (NCATS), National Institutes of Health |
| Organism | Human subjects; SARS-CoV-2 |
| Research Areas | Clinical Studies, Medical Virology, Public Health, Patient Care, Cardiovascular Disease, Diabetes & Obesity, Environmental Health, Immunocompromised or Compromised (ISC), Oncology, Rural Health, Social Drivers of Health |
| Data Types | Clinical data |
| Controlled Access | Yes |
| Data Access Control Description | Enclave |
| Data Access Control Links | https://covid.cd2h.org/account-instructions/ |
| Fairsharing Link | https://fairsharing.org/FAIRsharing.bbbffe |

Return to Navigation 

Table 15: All of Us Research Hub #table15

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | All of Us Research Hub |
| Repository Short Name | N/A |
| Brief Description | The All of Us Research Hub houses a large and comprehensive dataset where users can explore aggregate data including genomic variants, survey responses, physical measurements, electronic health record information, and wearables data. Registered users can use the Researcher Workbench to analyze Registered and Controlled tier data with a variety of cloud-based analysis tools. |
| Web Address | https://www.researchallofus.org/ |
| Help Email | [email protected] |
| Affiliation | National Institutes of Health |
| Organism | Human subjects |
| Research Areas | general health, social factors, health care access and utilization, drug exposures, chronic disease, health behavior, genomics |
| Data Types | Research data, survey data, genomics data, Electronic Health Records (EHR) data, self-reported physical measurements, digital health data |
| Controlled Access | Yes |
| Data Access Control Description | Open; Registration Required; Controlled Access; Enclave. There are multiple access tiers with access controls that accord with the risk of the data within a given tier. |
| Data Access Control Links | https://support.researchallofus.org/hc/en-us/categories/8951135815700-Access-DURA-Support |
| Fairsharing Link | N/A |

Return to Navigation 

Table 16: PEDSnet: A pediatric learning health system #table16

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | PEDSnet: A pediatric learning health system |
| Repository Short Name | PEDSnet |
| Brief Description | PEDSnet contains demographic and clinical data from over 15,000,000 pediatric patients across the United States. The system aligns information from outpatient, inpatient, and emergency department visits to a common data model and makes them available to authorized users through a secure data portal. |
| Web Address | https://pedsnet.org/database/ |
| Help Email | [email protected] |
| Affiliation | PEDSnet (a Clinical Research Network within PCORnet) |
| Organism | Human subjects |
| Research Areas | Demographics, Diagnoses, Medications, Lab Measurements, Procedures, Providers, Visits |
| Data Types | EHR, research data |
| Controlled Access | Yes |
| Data Access Control Description | Controlled Access |
| Data Access Control Links | https://pedsnet.org/database/access-to-data/ |
| Fairsharing Link | N/A |

Return to Navigation 

Table 17: UK Biobank #table17

| Field Name | Field Definition |
| --- | --- |
| Repository Full Name | UK Biobank |
| Repository Short Name | UK Biobank |
| Brief Description | UK Biobank is a large-scale biomedical database and research resource, containing in-depth, de-identified genetic and health information from half a million people ages 40-69 living in the UK. The database, which is regularly augmented with additional data, is globally accessible to approved researchers and scientists undertaking vital research into the most common and life-threatening diseases. |
| Web Address | https://www.ukbiobank.ac.uk/ |
| Help Email | [email protected] |
| Affiliation | Wellcome Trust, Medical Research Council, Department of Health, Scottish Government, and the Northwest Regional Development Agency |
| Organism | Human subjects |
| Research Areas | Research areas involving human health and disease |
| Data Types | Electronic Health Records, Surveys and Questionnaires, Research visit, Wearable Fitness Device, Genomic, Registry, Imaging, Genetics, Health linkages, Biomarkers, Baseline assessments |
| Controlled Access | Yes |
| Data Access Control Description | Registration required |
| Data Access Control Links | https://www.ukbiobank.ac.uk/enable-your-research/register; https://ams.ukbiobank.ac.uk/ams/signup |
| Fairsharing Link | N/A |

Return to Navigation 
