Methodology

Version 18.1 by manuelmenendez on 2025/02/13 12:52

Overview

Neurodiagnoses develops a tridimensional diagnostic framework for CNS diseases, incorporating AI-powered annotation tools to improve interpretability, standardization, and clinical utility. The methodology integrates multi-modal data, including genetic, neuroimaging, neurophysiological, and biomarker datasets, and applies machine learning models to generate structured, explainable diagnostic outputs.


How to Use External Databases in Neurodiagnoses

To enhance the accuracy of our diagnostic models, Neurodiagnoses integrates data from multiple biomedical and neurological research databases. If you are a researcher, follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.

Potential Data Sources

Neurodiagnoses maintains an updated list of potential biomedical databases relevant to neurodegenerative diseases.

1. Register for Access

Each external database requires individual registration and access approval. Follow the official guidelines of each database provider.

  • Ensure that you have obtained all required ethical approvals and signed the relevant data access agreements before integrating datasets into Neurodiagnoses.
  • Some repositories require a Data Usage Agreement (DUA) before downloading sensitive medical data.

2. Download & Prepare Data

Once access is granted, download datasets while complying with data usage policies. Ensure that the files meet Neurodiagnoses’ format requirements for smooth integration.

Supported File Formats

  • Tabular Data: .csv, .tsv
  • Neuroimaging Data: .nii, .dcm
  • Genomic Data: .fasta, .vcf
  • Clinical Metadata: .json, .xml
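As a quick pre-upload check, the format list above can be expressed as an extension lookup. This is a minimal sketch, not part of the Neurodiagnoses codebase: the function name `classify_file` and the category labels are hypothetical, and only the extensions come from the list above.

```python
from pathlib import Path
from typing import Optional

# Extension-to-category map copied from the "Supported File Formats" list.
# The category labels on the right are illustrative names, not project terms.
SUPPORTED_FORMATS = {
    ".csv": "tabular",
    ".tsv": "tabular",
    ".nii": "neuroimaging",
    ".dcm": "neuroimaging",
    ".fasta": "genomic",
    ".vcf": "genomic",
    ".json": "clinical_metadata",
    ".xml": "clinical_metadata",
}


def classify_file(filename: str) -> Optional[str]:
    """Return the data category for a file name, or None if unsupported."""
    # Only the final suffix is inspected, compared case-insensitively.
    suffix = Path(filename).suffix.lower()
    return SUPPORTED_FORMATS.get(suffix)


if __name__ == "__main__":
    for name in ["cohort.csv", "scan.NII", "notes.txt"]:
        print(name, "->", classify_file(name))
```

Because only the final suffix is inspected, a compressed file such as `scan.nii.gz` is reported as unsupported by this sketch; whether compressed variants are accepted is not stated in the list above.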

Mandatory Fields for Integration

Field Name            | Description
Subject ID            | Unique patient identifier
Diagnosis             | Standardized disease classification
Biomarkers            | CSF, plasma, or imaging biomarkers
Genetic Data          | Whole-genome or exome sequencing
Neuroimaging Metadata | MRI/PET acquisition parameters
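A tabular file can be checked against the mandatory fields before upload. The sketch below assumes the fields appear as CSV columns named `subject_id`, `diagnosis`, `biomarkers`, `genetic_data`, and `neuroimaging_metadata`; these column names and the helper `check_table` are hypothetical, since the page does not specify the exact headers.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical column names standing in for the mandatory fields.
MANDATORY_COLUMNS = [
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
]


def check_table(csv_text: str, delimiter: str = ",") -> Tuple[List[str], List[str]]:
    """Return (missing mandatory columns, duplicated subject IDs)."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    # Normalize header names so "Subject_ID " matches "subject_id".
    header = [name.strip().lower() for name in rows[0]] if rows else []
    missing = [col for col in MANDATORY_COLUMNS if col not in header]

    duplicates: List[str] = []
    if "subject_id" in header:
        idx = header.index("subject_id")
        seen = set()
        for row in rows[1:]:
            if len(row) <= idx:
                continue  # skip short or empty rows
            subject = row[idx].strip()
            if subject in seen and subject not in duplicates:
                duplicates.append(subject)
            seen.add(subject)
    return missing, duplicates


if __name__ == "__main__":
    sample = "subject_id,diagnosis,biomarkers\nS1,AD,abeta42\nS1,PD,tau\n"
    print(check_table(sample))
```

Pass `delimiter="\t"` for .tsv files. The duplicate check reflects the requirement that Subject ID be a unique patient identifier.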

3. Upload Data to Neurodiagnoses

Once preprocessed, data can be uploaded to EBRAINS or GitHub.

  • Option 1: Upload to EBRAINS Bucket

  • Option 2: Contribute via GitHub Repository

Note: For large datasets, please contact the project administrators before uploading.

4. Integrate Data into AI Models

Once uploaded, datasets must be harmonized and formatted before AI model training.

Steps for Data Integration

  • Open Jupyter Notebooks on EBRAINS to run preprocessing scripts.
  • Standardize neuroimaging and biomarker formats using harmonization tools.
  • Use machine learning models to impute missing data and extract features.
  • Train AI models with newly integrated patient cohorts.
  • Reference: Detailed instructions can be found in docs/data_processing.md.
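To illustrate the kind of preprocessing the steps above describe, here is a minimal standard-library sketch of two common operations: filling missing biomarker values and standardizing a feature. It is not the project's preprocessing script (see docs/data_processing.md for that). Mean imputation and z-scoring are deliberately simple stand-ins for the harmonization tools and machine learning models mentioned above, and the function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Optional


def impute_mean(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a feature with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def zscore(values: List[float]) -> List[float]:
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Hypothetical biomarker measurements with one missing value.
    raw = [1.0, None, 3.0]
    filled = impute_mean(raw)
    print(filled)
    print(zscore(filled))
```

In practice, standardization would typically be applied per acquisition site or cohort so that scanner or assay differences are not mistaken for biological signal; that grouping is left out here to keep the example short.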

Database Sources Table

Key Databases for Neurodiagnoses

Database       | Focus Area                      | Data Type                            | Access Link
ADNI           | Alzheimer's Disease             | MRI, PET, CSF, cognitive tests       | ADNI
PPMI           | Parkinson’s Disease             | Imaging, biospecimens                | PPMI
GP2            | Genetic Data for PD             | Whole-genome sequencing              | GP2
Enroll-HD      | Huntington’s Disease            | Clinical, genetic, imaging           | Enroll-HD
GAAIN          | Alzheimer's & Cognitive Decline | Multi-source data aggregation        | GAAIN
UK Biobank     | Population-wide studies         | Genetic, imaging, health records     | UK Biobank
DPUK           | Dementia & Aging                | Imaging, genetics, lifestyle factors | DPUK
PRION Registry | Prion Diseases                  | Clinical and genetic data            | PRION Registry
DECIPHER       | Rare Genetic Disorders          | Genomic variants                     | DECIPHER

If you know of a relevant dataset that is not listed, submit a proposal in GitHub Issues.


Collaboration & Partnerships

Partnering with Data Providers

Beyond using existing datasets, Neurodiagnoses seeks partnerships with data repositories to:

  • Enable direct API-based data integration for real-time processing.
  • Co-develop harmonized AI-ready datasets with standardized annotations.
  • Secure funding opportunities through joint grant applications.

Interested in Partnering?

If you represent a research consortium or database provider, reach out to explore data-sharing agreements.


Final Notes

Neurodiagnoses continuously expands its data ecosystem to support AI-driven clinical decision-making. Researchers and institutions are encouraged to contribute new datasets and methodologies.

For additional technical documentation:

If you experience issues integrating data, open a GitHub Issue or consult the EBRAINS Neurodiagnoses Forum.