Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

Neurodiagnoses AI is an open-source, AI-driven framework designed to enhance the diagnosis and prognosis of central nervous system (CNS) disorders across a broad spectrum of neurological conditions. The system integrates multimodal data sources (EEG, neuroimaging, biomarkers, and genetics) and employs machine learning models to deliver explainable, real-time diagnostic insights. A key feature of the framework is the incorporation of the Generalized Neuro Biomarker Ontology Categorization (Neuromarker) and Disease Knowledge Transfer (DKT), which together standardize disease and biomarker classification across all CNS diseases and facilitate cross-disease AI training.

Neuromarker: Generalized Biomarker Ontology

Neuromarker extends the Common Alzheimer’s Disease Research Ontology (CADRO) into a comprehensive biomarker categorization framework applicable to all neurodegenerative diseases (NDDs). This ontology enables standardized classification, AI-based feature extraction, and seamless multimodal data integration.

Recommended Software

There is a suite of software that can help implement the workflow needed in Neurodiagnoses. Find a list of recommendations here.

Core Biomarker Categories

Within the Neurodiagnoses AI framework, biomarkers are categorized as follows:

Category | Description
Molecular Biomarkers | Omics-based markers (genomic, transcriptomic, proteomic, metabolomic, lipidomic)
Neuroimaging Biomarkers | Structural (MRI, CT), Functional (fMRI, PET), Molecular Imaging (tau, amyloid, α-synuclein)
Fluid Biomarkers | CSF, plasma, blood-based markers for tau, amyloid, α-synuclein, TDP-43, GFAP, NfL, autoantibodies
Neurophysiological Biomarkers | EEG, MEG, evoked potentials (ERP), sleep-related markers
Digital Biomarkers | Gait analysis, cognitive/speech biomarkers, wearables data, EHR-based markers
Clinical Phenotypic Markers | Standardized clinical scores (MMSE, MoCA, CDR, UPDRS, ALSFRS, UHDRS)
Genetic Biomarkers | Risk alleles (APOE, LRRK2, MAPT, C9orf72, PRNP) and polygenic risk scores
Environmental & Lifestyle Factors | Toxins, infections, diet, microbiome, comorbidities
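
The categories listed above map naturally onto a small lookup structure. The following is a minimal sketch only: `BiomarkerCategory`, `CATEGORY_BY_MARKER`, and `categorize` are hypothetical names, not part of the Neurodiagnoses codebase, and the marker list is a small sample drawn from the table. Markers that appear in more than one category (for example tau, which is both an imaging and a fluid marker) are deliberately left out.

```python
from enum import Enum


class BiomarkerCategory(Enum):
    """Neuromarker core categories (hypothetical enum for illustration)."""
    MOLECULAR = "Molecular Biomarkers"
    NEUROIMAGING = "Neuroimaging Biomarkers"
    FLUID = "Fluid Biomarkers"
    NEUROPHYSIOLOGICAL = "Neurophysiological Biomarkers"
    DIGITAL = "Digital Biomarkers"
    CLINICAL = "Clinical Phenotypic Markers"
    GENETIC = "Genetic Biomarkers"
    ENVIRONMENTAL = "Environmental & Lifestyle Factors"


# Sample mapping from lower-cased marker name to category (assumed, not exhaustive)
CATEGORY_BY_MARKER = {
    "nfl": BiomarkerCategory.FLUID,
    "gfap": BiomarkerCategory.FLUID,
    "eeg": BiomarkerCategory.NEUROPHYSIOLOGICAL,
    "meg": BiomarkerCategory.NEUROPHYSIOLOGICAL,
    "gait analysis": BiomarkerCategory.DIGITAL,
    "mmse": BiomarkerCategory.CLINICAL,
    "moca": BiomarkerCategory.CLINICAL,
    "apoe": BiomarkerCategory.GENETIC,
    "lrrk2": BiomarkerCategory.GENETIC,
}


def categorize(marker):
    """Return the category for a marker name, or None if it is not in the sample mapping."""
    return CATEGORY_BY_MARKER.get(marker.strip().lower())
```

A lookup such as `categorize("NfL")` returns `BiomarkerCategory.FLUID`, while an unknown marker returns `None` so that callers can decide how to handle unmapped fields.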

Integrating External Databases into Neurodiagnoses

To enhance diagnostic precision, Neurodiagnoses AI incorporates data from multiple biomedical and neurological research databases. Researchers can integrate external datasets by following these steps:

  1. Register for Access

    • Each external database requires individual registration and access approval.
    • Ensure compliance with ethical approvals and data usage agreements before integrating datasets into Neurodiagnoses.
    • Some repositories may require a Data Usage Agreement (DUA) for sensitive medical data.
  2. Download & Prepare Data

    • Download datasets while adhering to database usage policies.
    • Ensure files meet Neurodiagnoses format requirements:

      Data Type | Accepted Formats
      Tabular Data | .csv, .tsv
      Neuroimaging | .nii, .dcm
      Genomic Data | .fasta, .vcf
      Clinical Metadata | .json, .xml
    • Mandatory Fields for Integration:

      • Subject ID: Unique patient identifier
      • Diagnosis: Standardized disease classification
      • Biomarkers: CSF, plasma, or imaging biomarkers
      • Genetic Data: Whole-genome or exome sequencing
      • Neuroimaging Metadata: MRI/PET acquisition parameters
  3. Upload Data to Neurodiagnoses

    • Option 1: Upload to EBRAINS Bucket

      • Location: EBRAINS Neurodiagnoses Bucket
      • Ensure correct metadata tagging before submission.
    • Option 2: Contribute via GitHub Repository

      • Location: GitHub Data Repository
      • Create a new folder under /data/ and include a dataset description.
      • For large datasets, contact project administrators before uploading.
  4. Integrate Data into AI Models

    • Open Jupyter Notebooks on EBRAINS to run preprocessing scripts.
    • Standardize neuroimaging and biomarker formats using harmonization tools.
    • Utilize machine learning models to handle missing data and feature extraction.
    • Train AI models with newly integrated patient cohorts.

    Reference: See docs/data_processing.md for detailed instructions.
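
The format and mandatory-field requirements from step 2 can be checked locally before anything is uploaded. The sketch below is an assumption about how such a check might look: the function names and the snake_case field keys are hypothetical stand-ins for the fields listed above, and the real preprocessing scripts may use different names. Only the extensions listed in the format table are accepted, so a compressed file such as `.nii.gz` is rejected here.

```python
from pathlib import PurePath

# Accepted extensions per data type, copied from the format table above
ACCEPTED_FORMATS = {
    "tabular": {".csv", ".tsv"},
    "neuroimaging": {".nii", ".dcm"},
    "genomic": {".fasta", ".vcf"},
    "clinical_metadata": {".json", ".xml"},
}

# Hypothetical keys standing in for the mandatory fields listed above
MANDATORY_FIELDS = (
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
)


def check_file_format(filename, data_type):
    """Return True if the file extension is accepted for the given data type."""
    suffix = PurePath(filename).suffix.lower()
    return suffix in ACCEPTED_FORMATS[data_type]


def missing_fields(record):
    """Return the mandatory fields that are absent or empty in a record dict."""
    return [f for f in MANDATORY_FIELDS if record.get(f) in (None, "", [], {})]
```

For example, a record containing only `subject_id`, `diagnosis`, and `biomarkers` would be reported as missing `genetic_data` and `neuroimaging_metadata`.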

AI-Driven Biomarker Categorization

Neurodiagnoses employs advanced AI models for biomarker classification:

Model Type | Application
Graph Neural Networks (GNNs) | Identify shared biomarker pathways across diseases
Contrastive Learning | Distinguish overlapping vs. unique biomarkers
Multimodal Transformer Models | Integrate imaging, omics, and clinical data
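
To make the contrastive learning entry more concrete, the sketch below computes an InfoNCE-style loss for a single anchor embedding. This page does not specify which contrastive objective Neurodiagnoses uses, so InfoNCE is an assumption chosen because it is a common default; the function names and the toy two-dimensional embeddings are purely illustrative. The loss is small when the anchor is closer to its positive (for example, a biomarker profile from a related disease pathway) than to the negatives.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor, one positive, and a list of negatives."""
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    # Negative log-probability assigned to the positive pair
    return log_denominator - logits[0]
```

In a real training loop this loss would be averaged over a batch and minimized with respect to the encoder that produces the embeddings.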

Jupyter Integration with EBRAINS

Overview

Neurodiagnoses Open Source leverages Jupyter Notebooks from EBRAINS to facilitate neurodiagnostic research, biomarker analysis, and AI-driven data processing. This integration provides an interactive and reproducible environment for developing machine learning models, analyzing neuroimaging data, and exploring multimodal biomarkers.

Jupyter integration in EBRAINS empowers Neurodiagnoses Open Source to:

  • Analyze MRI, EEG, and biomarker data efficiently.
  • Train machine learning models with high-performance computing.
  • Ensure transparency with interactive explainability tools.
  • Enable collaborative neurodiagnostic research with reproducible notebooks.

Key Capabilities of Jupyter in Neurodiagnoses

1. Neuroimaging Analysis (MRI, fMRI, PET)

  • Preprocessing Pipelines:
    • Use Nipype, Nilearn, ANTs, and FreeSurfer for structural and functional MRI analysis.
    • Skull stripping, segmentation, and registration of MRI scans.
    • Entropy-based slice selection for training deep learning models.
  • Deep Learning for Neuroimaging:
    • Implement CNN-based models (ResNet, VGG16, Autoencoders) for biomarker extraction.
    • Feature-based classification of Alzheimer’s, Parkinson’s, and MCI from neuroimaging data.
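
Entropy-based slice selection can be illustrated without any imaging library. The sketch below assumes the criterion is the Shannon entropy of each slice's intensity histogram and that intensities are 8-bit (0 to 255); neither detail is stated on this page, and `slice_entropy` and `select_slices` are hypothetical names. A volume is represented as a plain list of 2D slices to keep the example dependency-free.

```python
import math
from collections import Counter


def slice_entropy(slice_2d, n_bins=32, max_intensity=255):
    """Shannon entropy (in bits) of the intensity histogram of one 2D slice."""
    pixels = [p for row in slice_2d for p in row]
    # Map each intensity in [0, max_intensity] to one of n_bins equal-width bins
    bins = Counter(
        min(int(p * n_bins / (max_intensity + 1)), n_bins - 1) for p in pixels
    )
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


def select_slices(volume, k):
    """Return the indices of the k highest-entropy slices, in ascending slice order."""
    ranked = sorted(
        range(len(volume)), key=lambda i: slice_entropy(volume[i]), reverse=True
    )
    return sorted(ranked[:k])
```

Near-empty slices at the edges of a scan have low entropy and are dropped, while slices with varied tissue intensities are kept for model training.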

2. EEG and MEG Signal Processing

  • Data Preprocessing & Artifact Removal:
    • Use MNE-Python for filtering, ICA-based artifact rejection, and time-series normalization.
    • Extract frequency and time-domain features from EEG/MEG signals.
  • Feature Engineering & Connectivity Analysis:
    • Functional connectivity analysis using coherence and phase synchronization metrics.
    • Graph-theory-based EEG biomarkers for neurodegenerative disease classification.
  • Deep Learning for EEG Analysis:
    • Train LSTMs and CNNs for automatic EEG-based classification of MCI and cognitive decline.
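
As a dependency-free illustration of frequency-domain feature extraction, the sketch below estimates band power from a naive discrete Fourier transform. In the notebooks this step would normally be handled by MNE-Python; the `band_power` function, the band limits, and the synthetic signal are assumptions made for this example only.

```python
import cmath
import math


def band_power(signal, fs, f_low, f_high):
    """Sum of periodogram power over DFT bins with f_low <= frequency <= f_high."""
    n = len(signal)
    total = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_low <= freq <= f_high:
            # Naive DFT coefficient for bin k (adequate for short demo signals)
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            total += abs(coeff) ** 2 / n
    return total


# Synthetic 2-second trace: strong 10 Hz (alpha) plus weak 6 Hz (theta) component
FS = 128
eeg = [
    math.sin(2 * math.pi * 10 * t / FS) + 0.2 * math.sin(2 * math.pi * 6 * t / FS)
    for t in range(2 * FS)
]

alpha = band_power(eeg, FS, 8, 13)  # dominated by the 10 Hz component
theta = band_power(eeg, FS, 4, 8)   # contains only the weak 6 Hz component
```

Ratios between band powers, such as theta to alpha, are the kind of frequency-domain feature that can then be passed to a classifier.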

3. Machine Learning for Biomarker Discovery

  • SHAP-based Explainability for Biomarkers:
    • Use Random Forest + SHAP to rank the most predictive CSF, blood, and imaging biomarkers.
    • Generate SHAP summary plots to interpret the impact of individual biomarkers.
  • Multi-Modal Feature Selection:
    • Implement Anchor-Graph Feature Selection to combine MRI, EEG, and CSF data.
    • PCA and autoencoders for dimensionality reduction and feature extraction.
  • Automated Risk Prediction Models:
    • Train ensemble models combining deep learning and classical ML algorithms.
    • Apply subject-level cross-validation to prevent data leakage and ensure reproducibility.
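
Subject-level cross-validation means that all samples from one subject fall on the same side of every train/test split, so a model is never evaluated on a patient it has already seen. The sketch below shows the idea with the standard library only; `subject_level_folds` is a hypothetical helper, and in practice scikit-learn's `GroupKFold` provides the same guarantee.

```python
from collections import defaultdict


def subject_level_folds(subject_ids, n_folds):
    """Yield (train_idx, test_idx) pairs in which no subject appears on both sides."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    subjects = sorted(by_subject)
    if n_folds > len(subjects):
        raise ValueError("n_folds cannot exceed the number of subjects")
    for fold in range(n_folds):
        # Assign every n_folds-th subject to the test side of this fold
        test_subjects = set(subjects[fold::n_folds])
        test_idx = [i for s in subjects if s in test_subjects for i in by_subject[s]]
        train_idx = [i for s in subjects if s not in test_subjects for i in by_subject[s]]
        yield train_idx, test_idx
```

Splitting by row instead of by subject would let repeated scans or visits from one patient leak into both sets and inflate the reported accuracy.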

4. Computational Simulations & Virtual Brain Models

  • Integration with The Virtual Brain (TVB):
    • Simulate large-scale brain networks based on individual neuroimaging data.
    • Model the effect of neurodegenerative progression on brain activity.
  • Cortical and Subcortical Connectivity Analysis:
    • Generate connectivity matrices using diffusion MRI and functional MRI correlations.
    • Validate computational simulations with real patient data from EBRAINS datasets.
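
A functional connectivity matrix of the kind mentioned above can be sketched as pairwise Pearson correlations between regional time series. This is a simplified illustration under stated assumptions: the function names are hypothetical, each region is a plain list of samples, and no preprocessing (filtering, motion correction, parcellation) is shown. A constant time series has zero variance and would cause a division by zero, so such regions need to be excluded beforehand.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def connectivity_matrix(region_series):
    """Symmetric matrix of pairwise correlations between regional time series."""
    n = len(region_series)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(region_series[i], region_series[j])
            matrix[i][j] = matrix[j][i] = r
    return matrix
```

The resulting matrix has ones on the diagonal and can serve as an empirical reference when comparing simulated and recorded activity.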

5. Interactive Data Visualization & Reporting

  • Dynamic Plots & Dashboards:
    • Use Matplotlib, Seaborn, and Plotly for interactive visualizations of biomarkers.
    • Implement real-time MRI slice rendering and EEG signal visualization.
  • Automated Report Generation:
    • Generate Jupyter-based PDF reports summarizing key findings.
    • Export analysis results in JSON, CSV, and interactive web dashboards.
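
Exporting results to JSON and CSV needs only the standard library. The sketch below assumes results are held as a non-empty list of per-subject dictionaries with identical keys; `export_results` and the output file names are hypothetical and not taken from the Neurodiagnoses notebooks.

```python
import csv
import json
from pathlib import Path


def export_results(results, out_dir):
    """Write a list of per-subject result dicts to results.json and results.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    json_path = out_dir / "results.json"
    csv_path = out_dir / "results.csv"
    json_path.write_text(json.dumps(results, indent=2), encoding="utf-8")
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        # Column order follows the key order of the first record
        writer = csv.DictWriter(handle, fieldnames=list(results[0].keys()))
        writer.writeheader()
        writer.writerows(results)
    return json_path, csv_path
```

The JSON file preserves nested values for downstream tools, while the CSV file is convenient for spreadsheets and quick inspection.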

How to Use Neurodiagnoses with Jupyter in EBRAINS

1. Access EBRAINS Jupyter Environment

  1. Create an EBRAINS account at EBRAINS.eu.
  2. Navigate to the Collaboratory and open the Jupyter Lab interface.
  3. Clone the Neurodiagnoses repository:
git clone https://github.com/neurodiagnoses
cd neurodiagnoses
pip install -r requirements.txt

2. Run Prebuilt Neurodiagnoses Notebooks

  1. Open the notebooks/ directory inside Jupyter.
  2. Run any of the available notebooks:
    • mri_biomarker_analysis.ipynb → Extracts MRI-based biomarkers.
    • eeg_preprocessing.ipynb → Cleans and processes EEG signals.
    • shap_biomarker_explainability.ipynb → Visualizes biomarker importance.
    • disease_risk_prediction.ipynb → Runs ML models for disease classification.

3. Train Custom AI Models on EBRAINS HPC Resources

  • Use EBRAINS GPU and HPC clusters for deep learning training:
from neurodiagnoses.models import train_cnn_model
# Assumption: train_cnn_model returns the trained model object
model = train_cnn_model(data_path='data/mri/', model_type='ResNet50')
  • Save trained models for deployment:
model.save('models/neurodiagnoses_cnn.h5')

For further developments, contribute to the Neurodiagnoses GitHub Repository.

Collaboration & Partnerships

Neurodiagnoses actively seeks partnerships with data providers to:

  • Enable API-based data integration for real-time processing.
  • Co-develop harmonized AI-ready datasets with standardized annotations.
  • Secure funding opportunities through joint grant applications.

Interested in Partnering?

If you represent a research consortium or database provider, reach out to explore data-sharing agreements.

Contact: info@neurodiagnoses.com

Final Notes

Neurodiagnoses AI is committed to advancing the integration of artificial intelligence in neurodiagnostic processes. By continuously expanding our data ecosystem and incorporating standardized biomarker classifications through the Neuromarker ontology, we aim to enhance cross-disease AI training and improve diagnostic accuracy across neurodegenerative disorders.

We encourage researchers and institutions to contribute new datasets and methodologies to further enrich this collaborative platform. Your participation is vital in driving innovation and fostering a deeper understanding of complex neurological conditions.

If you encounter any issues during data integration or have suggestions for improvement, please open a GitHub Issue or consult the EBRAINS Neurodiagnoses Forum. Together, we can advance the field of neurodiagnostics and contribute to better patient outcomes.