Changes for page Methodology
Last modified by manuelmenendez on 2025/03/14 08:31
From version 5.1
edited by manuelmenendez
on 2025/01/29 19:11
Change comment:
There is no comment for this version
To version 20.1
edited by manuelmenendez
on 2025/02/14 14:47
Change comment:
There is no comment for this version
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -1,109 +1,146 @@
-=== **Overview** ===
+Here is the updated **Methodology** section for the EBRAINS Wiki, incorporating the **Generalized Neuro Biomarker Ontology Categorization (Neuromarker)** for **biomarker classification across all neurodegenerative diseases**.
 
-This section describes the step-by-step process used in the **Neurodiagnoses** project to develop a novel diagnostic framework for neurological diseases. The methodology integrates artificial intelligence (AI), biomedical ontologies, and computational neuroscience to create a structured, interpretable, and scalable diagnostic system.
-
 ----
 
-=== **1. Data Integration** ===
+== **Neurodiagnoses AI: Multimodal AI for Neurodiagnostic Predictions** ==
 
-==== **Data Sources** ====
+=== **Project Overview** ===
 
-* **Biomedical Ontologies**:
-** Human Phenotype Ontology (HPO) for phenotypic abnormalities.
-** Gene Ontology (GO) for molecular and cellular processes.
-* **Neuroimaging Datasets**:
-** Example: Alzheimer’s Disease Neuroimaging Initiative (ADNI), OpenNeuro.
-* **Clinical and Biomarker Data**:
-** Anonymized clinical reports, molecular biomarkers, and test results.
+Neurodiagnoses AI implements **AI-driven diagnostic and prognostic models** for central nervous system (CNS) disorders, expanding the **Florey Dementia Index (FDI) methodology** to a broader set of neurological conditions. The approach integrates **multimodal data sources** (EEG, neuroimaging, biomarkers, and genetics) and employs machine learning models to provide **explainable, real-time diagnostic insights**. This framework now incorporates **Neuromarker**, a **generalized biomarker ontology** that categorizes biomarkers across neurodegenerative diseases, enabling **standardized, cross-disease AI training**.
 
+== **Neuromarker: Generalized Biomarker Ontology** ==
 
-==== **Data Preprocessing** ====
+Neuromarker extends the **Common Alzheimer’s Disease Research Ontology (CADRO)** into a **cross-disease biomarker categorization framework** applicable to all neurodegenerative diseases (NDDs). It allows for **standardized classification, AI-based feature extraction, and multimodal integration**.
 
-1. **Standardization**: Ensure all data sources are normalized to a common format.
-1. **Feature Selection**: Identify relevant features for diagnosis (e.g., biomarkers, imaging scores).
-1. **Data Cleaning**: Handle missing values and remove duplicates.
+=== **Core Biomarker Categories** ===
 
+The following ontology is used within **Neurodiagnoses AI** for biomarker categorization:
+
+|=**Category**|=**Description**
+|**Molecular Biomarkers**|Omics-based markers (genomic, transcriptomic, proteomic, metabolomic, lipidomic)
+|**Neuroimaging Biomarkers**|Structural (MRI, CT), Functional (fMRI, PET), Molecular Imaging (tau, amyloid, α-synuclein)
+|**Fluid Biomarkers**|CSF, plasma, blood-based markers for tau, amyloid, α-synuclein, TDP-43, GFAP, NfL
+|**Neurophysiological Biomarkers**|EEG, MEG, evoked potentials (ERP), sleep-related markers
+|**Digital Biomarkers**|Gait analysis, cognitive/speech biomarkers, wearables data, EHR-based markers
+|**Clinical Phenotypic Markers**|Standardized clinical scores (MMSE, MoCA, CDR, UPDRS, ALSFRS, UHDRS)
+|**Genetic Biomarkers**|Risk alleles (APOE, LRRK2, MAPT, C9orf72, PRNP) and polygenic risk scores
+|**Environmental & Lifestyle Factors**|Toxins, infections, diet, microbiome, comorbidities
+
 ----
 
-=== **2. AI-Based Analysis** ===
+== **How to Use External Databases in Neurodiagnoses** ==
 
-==== **Model Development** ====
+To enhance diagnostic accuracy, Neurodiagnoses AI integrates data from **multiple biomedical and neurological research databases**. Researchers can follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.
 
-* **Embedding Models**: Use pre-trained models like BioBERT or BioLORD for text data.
-* **Classification Models**:
-** Algorithms: Random Forest, Support Vector Machines (SVM), or neural networks.
-** Purpose: Predict the likelihood of specific neurological conditions based on input data.
+=== **Potential Data Sources** ===
 
-==== **Dimensionality Reduction and Interpretability** ====
+Neurodiagnoses maintains an **updated list** of biomedical datasets relevant to neurodegenerative diseases:
 
-* Leverage [[DEIBO>>https://drive.ebrains.eu/f/8d7157708cde4b258db0/]] (Data-driven Embedding Interpretation Based on Ontologies) to connect model dimensions to ontology concepts.
-* Evaluate interpretability using metrics like the Area Under the Interpretability Curve (AUIC).
+* **ADNI**: Alzheimer's Disease Imaging & Biomarkers → [[ADNI>>url:https://adni.loni.usc.edu/]]
+* **PPMI**: Parkinson’s Disease Imaging & Biospecimens → [[PPMI>>url:https://www.ppmi-info.org/]]
+* **GP2**: Whole-Genome Sequencing for PD → [[GP2>>url:https://gp2.org/]]
+* **Enroll-HD**: Huntington’s Disease Clinical & Genetic Data → [[Enroll-HD>>url:https://www.enroll-hd.org/]]
+* **GAAIN**: Multi-Source Alzheimer’s Data Aggregation → [[GAAIN>>url:https://gaain.org/]]
+* **UK Biobank**: Population-Wide Genetic, Imaging & Health Records → [[UK Biobank>>url:https://www.ukbiobank.ac.uk/]]
+* **DPUK**: Dementia & Aging Data → [[DPUK>>url:https://www.dementiasplatform.uk/]]
+* **PRION Registry**: Prion Diseases Clinical & Genetic Data → [[PRION Registry>>url:https://prionregistry.org/]]
+* **DECIPHER**: Rare Genetic Disorder Genomic Variants → [[DECIPHER>>url:https://decipher.sanger.ac.uk/]]
 
 ----
 
-=== **3. Diagnostic Framework** ===
+== **1. Register for Access** ==
 
-==== **Axes of Diagnosis** ====
+* Each external database requires **individual registration and access approval**.
+* Ensure compliance with **ethical approvals and data usage agreements** before integrating datasets into Neurodiagnoses.
+* Some repositories may require a **Data Usage Agreement (DUA)** for sensitive medical data.
 
-The framework organizes diagnostic data into three axes:
+----
 
-1. **Etiology**: Genetic and environmental risk factors.
-1. **Molecular Markers**: Biomarkers such as amyloid-beta, tau, and alpha-synuclein.
-1. **Neuroanatomical Correlations**: Results from neuroimaging (e.g., MRI, PET).
+== **2. Download & Prepare Data** ==
 
-==== **Recommendation System** ====
+* Download datasets while adhering to **database usage policies**.
+* Ensure files meet **Neurodiagnoses format requirements**:
 
-* Suggests additional tests or biomarkers if gaps are detected in the data.
-* Prioritizes tests based on clinical impact and cost-effectiveness.
+|=**Data Type**|=**Accepted Formats**
+|**Tabular Data**|.csv, .tsv
+|**Neuroimaging**|.nii, .dcm
+|**Genomic Data**|.fasta, .vcf
+|**Clinical Metadata**|.json, .xml
 
+* **Mandatory Fields for Integration**:
+** **Subject ID**: Unique patient identifier
+** **Diagnosis**: Standardized disease classification
+** **Biomarkers**: CSF, plasma, or imaging biomarkers
+** **Genetic Data**: Whole-genome or exome sequencing
+** **Neuroimaging Metadata**: MRI/PET acquisition parameters
+
 ----
 
-=== **4. Computational Workflow** ===
+== **3. Upload Data to Neurodiagnoses** ==
 
-1. **Data Loading**: Import data from storage (Drive or Bucket).
-1. **Feature Engineering**: Generate derived features from the raw data.
-1. **Model Training**:
-1*. Split data into training, validation, and test sets.
-1*. Train models with cross-validation to ensure robustness.
-1. **Evaluation**:
-1*. Metrics: Accuracy, F1-Score, AUIC for interpretability.
-1*. Compare against baseline models and domain benchmarks.
+=== **Option 1: Upload to EBRAINS Bucket** ===
 
+* Location: **EBRAINS Neurodiagnoses Bucket**
+* Ensure **correct metadata tagging** before submission.
+
+=== **Option 2: Contribute via GitHub Repository** ===
+
+* Location: **GitHub Data Repository**
+* Create a **new folder under /data/** and include a **dataset description**.
+* **For large datasets**, contact project administrators before uploading.
+
 ----
 
-=== **5. Validation** ===
+== **4. Integrate Data into AI Models** ==
 
-==== **Internal Validation** ====
+* Open **Jupyter Notebooks** on EBRAINS to run **preprocessing scripts**.
+* **Standardize neuroimaging and biomarker formats** using harmonization tools.
+* Use **machine learning models** to handle **missing data** and **feature extraction**.
+* Train AI models with **newly integrated patient cohorts**.
 
-* Test the system using simulated datasets and known clinical cases.
-* Fine-tune models based on validation results.
+**Reference**: See docs/data_processing.md for detailed instructions.
 
-==== **External Validation** ====
+----
 
-* Collaborate with research institutions and hospitals to test the system in real-world settings.
-* Use anonymized patient data to ensure privacy compliance.
+== **AI-Driven Biomarker Categorization** ==
 
+Neurodiagnoses employs **AI models** for biomarker classification:
+
+|=**Model Type**|=**Application**
+|**Graph Neural Networks (GNNs)**|Identify shared biomarker pathways across diseases
+|**Contrastive Learning**|Distinguish overlapping vs. unique biomarkers
+|**Multimodal Transformer Models**|Integrate imaging, omics, and clinical data
+
 ----
 
-=== **6. Collaborative Development** ===
+== **Collaboration & Partnerships** ==
 
-The project is open to contributions from researchers, clinicians, and developers. Key tools include:
+=== **Partnering with Data Providers** ===
 
-* **Jupyter Notebooks**: For data analysis and pipeline development.
-** Example: [[probabilistic imputation>>https://drive.ebrains.eu/f/4f69ab52f7734ef48217/]]
-* **Wiki Pages**: For documenting methods and results.
-* **Drive and Bucket**: For sharing code, data, and outputs.
-* **Collaboration with related projects: **For instance: [[//Beyond the hype: AI in dementia – from early risk detection to disease treatment//>>https://www.lethe-project.eu/beyond-the-hype-ai-in-dementia-from-early-risk-detection-to-disease-treatment/]]
+Neurodiagnoses seeks partnerships with data repositories to:
 
+* Enable **API-based data integration** for real-time processing.
+* Co-develop **harmonized AI-ready datasets** with standardized annotations.
+* Secure **funding opportunities** through joint grant applications.
+
+**Interested in Partnering?**
+
+* If you represent a **research consortium or database provider**, reach out to explore **data-sharing agreements**.
+* **Contact**: [[info@neurodiagnoses.com>>mailto:info@neurodiagnoses.com]]
+
 ----
 
-=== **7. Tools and Technologies** ===
+== **Final Notes** ==
 
-* **Programming Languages**: Python for AI and data processing.
-* **Frameworks**:
-** TensorFlow and PyTorch for machine learning.
-** Flask or FastAPI for backend services.
-* **Visualization**: Plotly and Matplotlib for interactive and static visualizations.
-* **EBRAINS Services**:
-** Collaboratory Lab for running Notebooks.
-** Buckets for storing large datasets.
+Neurodiagnoses continuously expands its **data ecosystem** to support **AI-driven clinical decision-making**. Researchers and institutions are encouraged to **contribute new datasets and methodologies**.
+
+**For additional technical documentation**:
+
+* **GitHub Repository**: [[Neurodiagnoses GitHub>>url:https://github.com/neurodiagnoses]]
+* **EBRAINS Collaboration Page**: [[EBRAINS Neurodiagnoses>>url:https://ebrains.eu/collabs/neurodiagnoses]]
+
+**If you experience issues integrating data**, open a **GitHub Issue** or consult the **EBRAINS Neurodiagnoses Forum**.
+
+----
+
+This **updated methodology** now incorporates [[https:~~/~~/github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/biomarker_ontology>>https://Neuromarker]] for standardized biomarker classification, enabling **cross-disease AI training** across neurodegenerative disorders.
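
Note on the added "2. Download & Prepare Data" step: the accepted formats and mandatory fields it lists could be checked locally before upload. The following is only a minimal sketch under stated assumptions, not the project's actual preprocessing code. The extension table is copied from the new version of the page. The column names in `MANDATORY_COLUMNS` and both function names are hypothetical, since the page names the mandatory fields but does not say how they are spelled in a file.

```python
import csv
import io
from pathlib import PurePosixPath

# Accepted extensions per data type, copied from the "Accepted Formats" table.
ACCEPTED_FORMATS = {
    "Tabular Data": {".csv", ".tsv"},
    "Neuroimaging": {".nii", ".dcm"},
    "Genomic Data": {".fasta", ".vcf"},
    "Clinical Metadata": {".json", ".xml"},
}

# HYPOTHETICAL column names for the five mandatory fields; the page does not
# define the exact spelling expected in a tabular file.
MANDATORY_COLUMNS = [
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
]


def data_type_for(filename):
    """Return the data type whose accepted formats include the file's extension, or None."""
    suffix = PurePosixPath(filename).suffix.lower()
    for data_type, extensions in ACCEPTED_FORMATS.items():
        if suffix in extensions:
            return data_type
    return None


def missing_mandatory_columns(table_text, delimiter=","):
    """Return the mandatory columns absent from the header row of a delimited table."""
    reader = csv.reader(io.StringIO(table_text), delimiter=delimiter)
    header = next(reader, [])  # an empty table has no header at all
    present = {name.strip().lower() for name in header}
    return [name for name in MANDATORY_COLUMNS if name not in present]


# Example: a small in-memory CSV that lacks two of the assumed columns.
sample = "subject_id,diagnosis,biomarkers\nS001,AD,csf_tau\n"
print(data_type_for("cohort.csv"))
print(missing_mandatory_columns(sample))
```

Only the header row is inspected, so the check says nothing about whether the values themselves are valid. Pass `delimiter="\t"` for `.tsv` files.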