Methodology

Version 19.1 by manuelmenendez on 2025/02/14 13:57

# Neurodiagnoses AI: Multimodal AI for Neurodiagnostic Predictions

# Project Overview
Neurodiagnoses AI implements AI-driven diagnostic and prognostic models for central nervous system (CNS) disorders, adapting the Florey Dementia Index (FDI) methodology to a broader set of neurological conditions. The approach integrates multimodal data sources (EEG, neuroimaging, biomarkers, and genetics) and employs machine learning models to provide explainable, real-time diagnostic insights.

# How to Use External Databases in Neurodiagnoses
To enhance diagnostic accuracy, Neurodiagnoses integrates data from multiple biomedical and neurological research databases. Researchers can follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.

# Potential Data Sources
Neurodiagnoses maintains an updated list of potential biomedical databases relevant to neurodegenerative diseases. 

Reference: List of Potential Databases
- ADNI: Alzheimer's disease data ([ADNI](https://adni.loni.usc.edu))
- PPMI: Parkinson's disease imaging and biospecimens ([PPMI](https://www.ppmi-info.org))
- GP2: Whole-genome sequencing for Parkinson's disease (PD) ([GP2](https://gp2.org))
- Enroll-HD: Huntington's disease clinical and genetic data ([Enroll-HD](https://www.enroll-hd.org))
- GAAIN: Multi-source Alzheimer's disease data aggregation ([GAAIN](https://gaain.org))
- UK Biobank: Population-wide genetic, imaging, and health records ([UK Biobank](https://www.ukbiobank.ac.uk))
- DPUK: Dementia and aging data ([DPUK](https://www.dementiasplatform.uk))
- PRION Registry: Prion disease clinical and genetic data ([PRION Registry](https://prionregistry.org))
- DECIPHER: Genomic variants in rare genetic disorders ([DECIPHER](https://decipher.sanger.ac.uk))

# 1. Register for Access
- Each external database requires individual registration and access approval; follow the official guidelines of each database provider.
- Ensure compliance with ethical approvals and data usage agreements before integrating datasets into Neurodiagnoses.
- Some repositories may require a Data Usage Agreement (DUA) for sensitive medical data.

# 2. Download & Prepare Data
- Download datasets while adhering to database usage policies.
- Ensure files meet Neurodiagnoses format requirements:
  - Tabular Data: `.csv`, `.tsv`
  - Neuroimaging Data: `.nii`, `.dcm`
  - Genomic Data: `.fasta`, `.vcf`
  - Clinical Metadata: `.json`, `.xml`

- Mandatory Fields for Integration:
  - Subject ID: Unique patient identifier
  - Diagnosis: Standardized disease classification
  - Biomarkers: CSF, plasma, or imaging biomarkers
  - Genetic Data: Whole-genome or exome sequencing
  - Neuroimaging Metadata: MRI/PET acquisition parameters
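
The format and field requirements above can be checked before upload. The following is a minimal sketch, not the project's own validator: the function names and the column names in `MANDATORY_COLUMNS` are hypothetical stand-ins for the mandatory fields, since the actual column naming scheme is not specified here.

```python
import csv
import io
from pathlib import Path

# Extensions listed in the format requirements, grouped by data category.
SUPPORTED_FORMATS = {
    "tabular": {".csv", ".tsv"},
    "neuroimaging": {".nii", ".dcm"},
    "genomic": {".fasta", ".vcf"},
    "clinical_metadata": {".json", ".xml"},
}

# Hypothetical column names standing in for the mandatory fields.
MANDATORY_COLUMNS = [
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
]


def classify_file(filename):
    """Return the data category for a file name, or None if unsupported."""
    # Only the final suffix is inspected, so "scan.nii.gz" is not matched.
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_FORMATS.items():
        if suffix in extensions:
            return category
    return None


def missing_columns(handle, delimiter=","):
    """Return the mandatory columns absent from a tabular file's header row."""
    header = next(csv.reader(handle, delimiter=delimiter), [])
    present = {name.strip().lower() for name in header}
    return [name for name in MANDATORY_COLUMNS if name not in present]


if __name__ == "__main__":
    sample = io.StringIO("subject_id,diagnosis,biomarkers\nS001,AD,abeta42\n")
    print(classify_file("cohort.csv"))
    print(missing_columns(sample))
```

For `.tsv` files, pass `delimiter="\t"` to `missing_columns`.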

# 3. Upload Data to Neurodiagnoses
Option 1: Upload to EBRAINS Bucket
- Location: EBRAINS Neurodiagnoses Bucket
- Ensure correct metadata tagging before submission.

Option 2: Contribute via GitHub Repository
- Location: GitHub Data Repository
- Create a new folder under `/data/` and include a dataset description.
- For large datasets, contact project administrators before uploading.
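
For the GitHub option, the folder and its description can be prepared locally before opening a pull request. This is a rough sketch under stated assumptions: the helper name `create_dataset_folder` and the use of a `README.md` file as the dataset description are illustrative choices, not conventions confirmed by the repository.

```python
import tempfile
from pathlib import Path


def create_dataset_folder(repo_root, dataset_name, description):
    """Create data/<dataset_name>/ containing a README.md description."""
    folder = Path(repo_root) / "data" / dataset_name
    # exist_ok=False raises FileExistsError rather than reusing an existing folder.
    folder.mkdir(parents=True, exist_ok=False)
    readme = folder / "README.md"  # assumed description file name
    readme.write_text(f"# {dataset_name}\n\n{description}\n", encoding="utf-8")
    return folder


if __name__ == "__main__":
    # A temporary directory stands in for a local clone of the repository.
    with tempfile.TemporaryDirectory() as repo_root:
        created = create_dataset_folder(
            repo_root, "example_cohort", "Synthetic example dataset."
        )
        print(sorted(p.name for p in created.iterdir()))
```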

# 4. Integrate Data into AI Models
- Open Jupyter Notebooks on EBRAINS to run preprocessing scripts.
- Standardize neuroimaging and biomarker formats using harmonization tools.
- Use machine learning models to impute missing data and extract features.
- Train AI models with newly integrated patient cohorts.

Reference: See `docs/data_processing.md` for detailed instructions.
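
As a rough illustration of the missing-data and standardization steps, the sketch below fills gaps in a single biomarker column and rescales it. It deliberately swaps in simple techniques (median imputation and z-scoring) for the machine learning models and harmonization tools mentioned above, which are not named here and would do considerably more. All function names and the sample values are hypothetical.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


def zscore(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every entry to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def harmonize_column(values):
    """Impute missing entries, then standardize the column."""
    return zscore(impute_median(values))


if __name__ == "__main__":
    csf_tau = [300.0, None, 500.0, 400.0]  # synthetic values, one missing
    print(harmonize_column(csf_tau))
```

Imputing before standardizing keeps the column length fixed, so rows stay aligned with their subject IDs.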

# Collaboration & Partnerships
# Partnering with Data Providers
Neurodiagnoses seeks partnerships with data repositories to:
- Enable API-based data integration for real-time processing.
- Co-develop harmonized AI-ready datasets with standardized annotations.
- Secure funding opportunities through joint grant applications.

Interested in Partnering?
- If you represent a research consortium or database provider, reach out to explore data-sharing agreements.
- Contact: info@neurodiagnoses.com

# Final Notes
Neurodiagnoses continuously expands its data ecosystem to support AI-driven clinical decision-making. Researchers and institutions are encouraged to contribute new datasets and methodologies.

For additional technical documentation:
- GitHub Repository: [Neurodiagnoses GitHub](https://github.com/neurodiagnoses)
- EBRAINS Collaboration Page: [EBRAINS Neurodiagnoses](https://ebrains.eu/collabs/neurodiagnoses)

If you experience issues integrating data, open a GitHub Issue or consult the EBRAINS Neurodiagnoses Forum.

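# Key Databases for Neurodiagnoses

| Database | Focus Area | Data Type | Access Link |
|---|---|---|---|
| ADNI | Alzheimer's Disease | MRI, PET, CSF, cognitive tests | [ADNI](https://adni.loni.usc.edu) |
| PPMI | Parkinson's Disease | Imaging, biospecimens | [PPMI](https://www.ppmi-info.org) |
| GP2 | Genetic Data for PD | Whole-genome sequencing | [GP2](https://gp2.org) |
| Enroll-HD | Huntington's Disease | Clinical, genetic, imaging | [Enroll-HD](https://www.enroll-hd.org) |
| GAAIN | Alzheimer's & Cognitive Decline | Multi-source data aggregation | [GAAIN](https://gaain.org) |
| UK Biobank | Population-wide studies | Genetic, imaging, health records | [UK Biobank](https://www.ukbiobank.ac.uk) |
| DPUK | Dementia & Aging | Imaging, genetics, lifestyle factors | [DPUK](https://www.dementiasplatform.uk) |
| PRION Registry | Prion Diseases | Clinical and genetic data | [PRION Registry](https://prionregistry.org) |
| DECIPHER | Rare Genetic Disorders | Genomic variants | [DECIPHER](https://decipher.sanger.ac.uk) |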

If you know of a relevant dataset, submit a proposal via GitHub Issues.

