Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 20.1
edited by manuelmenendez
on 2025/02/14 14:47
Change comment: There is no comment for this version
To version 11.1
edited by manuelmenendez
on 2025/02/02 21:31
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,146 +1,189 @@
1 -Here is the updated **Methodology** section for the EBRAINS Wiki, incorporating the **Generalized Neuro Biomarker Ontology Categorization (Neuromarker)** for **biomarker classification across all neurodegenerative diseases**.
1 +==== **Overview** ====
2 2  
3 -----
3 +This project develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility**. The methodology integrates **multi-modal data**, including **genetic, neuroimaging, neurophysiological, and biomarker datasets**, and applies **machine learning models** to generate **structured, explainable diagnostic outputs**.
4 4  
5 -== **Neurodiagnoses AI: Multimodal AI for Neurodiagnostic Predictions** ==
5 +=== **Workflow** ===
6 6  
7 -=== **Project Overview** ===
7 +1. (((
8 +**We use GitHub to [[store and develop AI models, scripts, and annotation pipelines>>https://github.com/users/manuelmenendezgonzalez/projects/1/views/1]].**
8 8  
9 -Neurodiagnoses AI implements **AI-driven diagnostic and prognostic models** for central nervous system (CNS) disorders, expanding the **Florey Dementia Index (FDI) methodology** to a broader set of neurological conditions. The approach integrates **multimodal data sources** (EEG, neuroimaging, biomarkers, and genetics) and employs machine learning models to provide **explainable, real-time diagnostic insights**. This framework now incorporates **Neuromarker**, a **generalized biomarker ontology** that categorizes biomarkers across neurodegenerative diseases, enabling **standardized, cross-disease AI training**.
10 +* Create a **GitHub repository** for AI scripts and models.
11 +* Use **GitHub Projects** to manage research milestones.
12 +)))
13 +1. (((
14 +**We use EBRAINS for data and collaboration**
10 10  
11 -== **Neuromarker: Generalized Biomarker Ontology** ==
16 +* Store **biomarker and neuroimaging data** in **EBRAINS Buckets**.
17 +* Run **Jupyter Notebooks** in **EBRAINS Lab** to test AI models.
18 +* Use **EBRAINS Wiki** for structured documentation and research discussion.
19 +)))
12 12  
13 -Neuromarker extends the **Common Alzheimer’s Disease Research Ontology (CADRO)** into a **cross-disease biomarker categorization framework** applicable to all neurodegenerative diseases (NDDs). It allows for **standardized classification, AI-based feature extraction, and multimodal integration**.
21 +----
14 14  
15 -=== **Core Biomarker Categories** ===
23 +=== **1. Data Integration** ===
16 16  
17 -The following ontology is used within **Neurodiagnoses AI** for biomarker categorization:
25 +==== **Data Sources** ====
18 18  
19 -|=**Category**|=**Description**
20 -|**Molecular Biomarkers**|Omics-based markers (genomic, transcriptomic, proteomic, metabolomic, lipidomic)
21 -|**Neuroimaging Biomarkers**|Structural (MRI, CT), Functional (fMRI, PET), Molecular Imaging (tau, amyloid, α-synuclein)
22 -|**Fluid Biomarkers**|CSF, plasma, blood-based markers for tau, amyloid, α-synuclein, TDP-43, GFAP, NfL
23 -|**Neurophysiological Biomarkers**|EEG, MEG, evoked potentials (ERP), sleep-related markers
24 -|**Digital Biomarkers**|Gait analysis, cognitive/speech biomarkers, wearables data, EHR-based markers
25 -|**Clinical Phenotypic Markers**|Standardized clinical scores (MMSE, MoCA, CDR, UPDRS, ALSFRS, UHDRS)
26 -|**Genetic Biomarkers**|Risk alleles (APOE, LRRK2, MAPT, C9orf72, PRNP) and polygenic risk scores
27 -|**Environmental & Lifestyle Factors**|Toxins, infections, diet, microbiome, comorbidities
27 +**Biomedical Ontologies & Databases:**
28 28  
29 -----
29 +* **Human Phenotype Ontology (HPO)** for symptom annotation.
30 +* **Gene Ontology (GO)** for molecular and cellular processes.
30 30  
31 -== **How to Use External Databases in Neurodiagnoses** ==
32 +**Dimensionality Reduction and Interpretability:**
32 32  
33 -To enhance diagnostic accuracy, Neurodiagnoses AI integrates data from **multiple biomedical and neurological research databases**. Researchers can follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.
34 +* **Evaluate interpretability** using metrics like the **Area Under the Interpretability Curve (AUIC)**.
35 +* **Leverage [[DEIBO>>https://github.com/Mellandd/DEIBO]] (Data-driven Embedding Interpretation Based on Ontologies)** to connect model dimensions to ontology concepts.
34 34  
35 -=== **Potential Data Sources** ===
37 +**Neuroimaging & EEG/MEG Data:**
36 36  
37 -Neurodiagnoses maintains an **updated list** of biomedical datasets relevant to neurodegenerative diseases:
39 +* **MRI volumetric measures** for brain atrophy tracking.
40 +* **EEG functional connectivity patterns** (AI-Mind).
38 38  
39 -* **ADNI**: Alzheimer's Disease Imaging & Biomarkers → [[ADNI>>url:https://adni.loni.usc.edu/]]
40 -* **PPMI**: Parkinson’s Disease Imaging & Biospecimens → [[PPMI>>url:https://www.ppmi-info.org/]]
41 -* **GP2**: Whole-Genome Sequencing for PD → [[GP2>>url:https://gp2.org/]]
42 -* **Enroll-HD**: Huntington’s Disease Clinical & Genetic Data → [[Enroll-HD>>url:https://www.enroll-hd.org/]]
43 -* **GAAIN**: Multi-Source Alzheimer’s Data Aggregation → [[GAAIN>>url:https://gaain.org/]]
44 -* **UK Biobank**: Population-Wide Genetic, Imaging & Health Records → [[UK Biobank>>url:https://www.ukbiobank.ac.uk/]]
45 -* **DPUK**: Dementia & Aging Data → [[DPUK>>url:https://www.dementiasplatform.uk/]]
46 -* **PRION Registry**: Prion Diseases Clinical & Genetic Data → [[PRION Registry>>url:https://prionregistry.org/]]
47 -* **DECIPHER**: Rare Genetic Disorder Genomic Variants → [[DECIPHER>>url:https://decipher.sanger.ac.uk/]]
42 +**Clinical & Biomarker Data:**
48 48  
44 +* **CSF biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
45 +* **Sleep monitoring and actigraphy data** (ADIS).
46 +
47 +**Federated Learning Integration:**
48 +
49 +* **Secure multi-center data harmonization** (PROMINENT).
50 +
49 49  ----
50 50  
51 -== **1. Register for Access** ==
53 +==== **Annotation System for Multi-Modal Data** ====
52 52  
53 -* Each external database requires **individual registration and access approval**.
54 -* Ensure compliance with **ethical approvals and data usage agreements** before integrating datasets into Neurodiagnoses.
55 -* Some repositories may require a **Data Usage Agreement (DUA)** for sensitive medical data.
55 +To ensure **structured integration of diverse datasets**, **Neurodiagnoses** will implement an **AI-driven annotation system**, which will:
56 56  
57 +* **Assign standardized metadata tags** to diagnostic features.
58 +* **Provide contextual explanations** for AI-based classifications.
59 +* **Track temporal disease progression annotations** to identify long-term trends.
60 +
57 57  ----
58 58  
59 -== **2. Download & Prepare Data** ==
63 +=== **2. AI-Based Analysis** ===
60 60  
61 -* Download datasets while adhering to **database usage policies**.
62 -* Ensure files meet **Neurodiagnoses format requirements**:
65 +==== **Machine Learning & Deep Learning Models** ====
63 63  
64 -|=**Data Type**|=**Accepted Formats**
65 -|**Tabular Data**|.csv, .tsv
66 -|**Neuroimaging**|.nii, .dcm
67 -|**Genomic Data**|.fasta, .vcf
68 -|**Clinical Metadata**|.json, .xml
67 +**Risk Prediction Models:**
69 69  
70 -* **Mandatory Fields for Integration**:
71 -** **Subject ID**: Unique patient identifier
72 -** **Diagnosis**: Standardized disease classification
73 -** **Biomarkers**: CSF, plasma, or imaging biomarkers
74 -** **Genetic Data**: Whole-genome or exome sequencing
75 -** **Neuroimaging Metadata**: MRI/PET acquisition parameters
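The accepted formats and mandatory fields listed above can be checked before a dataset is uploaded. The following is a minimal sketch under stated assumptions: the key names (`subject_id`, `diagnosis`, `biomarkers`, `genetic_data`, `neuroimaging_metadata`), the data-type labels, and the `validate_submission` helper are hypothetical and are not taken from the Neurodiagnoses codebase.

```python
from pathlib import PurePath

# Accepted extensions per data type, taken from the format table above.
ACCEPTED_FORMATS = {
    "tabular": {".csv", ".tsv"},
    "neuroimaging": {".nii", ".dcm"},
    "genomic": {".fasta", ".vcf"},
    "clinical_metadata": {".json", ".xml"},
}

# Hypothetical key names for the mandatory fields listed above.
MANDATORY_FIELDS = (
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
)


def validate_submission(record, files):
    """Return a list of problems; an empty list means the submission passes.

    record: dict holding the mandatory fields.
    files:  dict mapping a data type (a key of ACCEPTED_FORMATS) to a file name.
    """
    problems = []
    # Report every mandatory field that is absent or empty.
    for field in MANDATORY_FIELDS:
        if record.get(field) in (None, "", [], {}):
            problems.append(f"missing field: {field}")
    # Compare each file extension with the formats accepted for its data type.
    for data_type, name in files.items():
        allowed = ACCEPTED_FORMATS.get(data_type)
        if allowed is None:
            problems.append(f"unknown data type: {data_type}")
        elif PurePath(name).suffix.lower() not in allowed:
            problems.append(f"unsupported format for {data_type}: {name}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a contributor fix all issues in a single pass before contacting the project administrators.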
69 +* **LETHE’s cognitive risk prediction model** integrated into the annotation framework.
76 76  
77 -----
71 +**Biomarker Classification & Probabilistic Imputation:**
78 78  
79 -== **3. Upload Data to Neurodiagnoses** ==
73 +* **KNN Imputer** and **Bayesian models** used for handling **missing biomarker data**.
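As an illustration of the KNN imputation mentioned above, here is a standard-library sketch of the idea: a missing biomarker value is replaced by the mean of that biomarker over the k most similar subjects. It is a simplified stand-in, not the project's pipeline; in practice a library implementation such as scikit-learn's `KNNImputer` would likely be used. The function name `knn_impute` and the toy values are assumptions for illustration only.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is the root mean squared difference over features observed in
    both rows, so rows with different missingness patterns stay comparable.
    The input rows are not modified; a new list of lists is returned.
    """
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j was measured.
            donors = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy values only (not real measurements); the second subject lacks feature 3.
subjects = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],
    [1.2, 2.1, 3.4],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(subjects, k=2)
```

With `k=2`, the missing value is averaged over the first and third subjects, which are closest on the two shared features; the distant fourth subject does not contribute.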
80 80  
81 -=== **Option 1: Upload to EBRAINS Bucket** ===
75 +**Neuroimaging Feature Extraction:**
82 82  
83 -* Location: **EBRAINS Neurodiagnoses Bucket**
84 -* Ensure **correct metadata tagging** before submission.
77 +* **MRI & EEG data** annotated with **neuroanatomical feature labels**.
85 85  
86 -=== **Option 2: Contribute via GitHub Repository** ===
79 +==== **AI-Powered Annotation System** ====
87 87  
88 -* Location: **GitHub Data Repository**
89 -* Create a **new folder under /data/** and include a **dataset description**.
90 -* **For large datasets**, contact project administrators before uploading.
81 +* Uses **SHAP-based interpretability tools** to explain model decisions.
82 +* Generates **automated clinical annotations** in structured reports.
83 +* Links findings to **standardized medical ontologies** (e.g., **SNOMED, HPO**).
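SHAP explanations are built on Shapley values, which split a prediction among the input features. The sketch below computes exact Shapley attributions for a tiny scoring function by enumerating feature orderings; it illustrates the principle only and does not use the `shap` library, which relies on approximations that scale to real models. The `toy_risk` function and its feature names are hypothetical and not a validated clinical model.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley attributions of model(instance) relative to model(baseline).

    Features absent from a coalition take their baseline value. The cost grows
    factorially with the number of features, so this suits tiny examples only.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            score = model(current)
            totals[name] += score - previous  # marginal contribution
            previous = score
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(x):
    # Hypothetical scoring function with an interaction term.
    return 2.0 * x["tau"] + 1.0 * x["nfl"] + 0.5 * x["tau"] * x["nfl"]


patient = {"tau": 1.0, "nfl": 2.0}
reference = {"tau": 0.0, "nfl": 0.0}
attributions = shapley_values(toy_risk, patient, reference)
```

The attributions sum to the difference between the patient's score and the reference score, which is the property that makes such values usable as per-feature explanations in a structured report.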
91 91  
92 92  ----
93 93  
94 -== **4. Integrate Data into AI Models** ==
87 +=== **3. Diagnostic Framework & Clinical Decision Support** ===
95 95  
96 -* Open **Jupyter Notebooks** on EBRAINS to run **preprocessing scripts**.
97 -* **Standardize neuroimaging and biomarker formats** using harmonization tools.
98 -* Use **machine learning models** to handle **missing data** and **feature extraction**.
99 -* Train AI models with **newly integrated patient cohorts**.
89 +==== **Tridimensional Diagnostic Axes** ====
100 100  
101 -**Reference**: See docs/data_processing.md for detailed instructions.
91 +**Axis 1: Etiology (Pathogenic Mechanisms)**
102 102  
93 +* Classification based on **genetic markers, cellular pathways, and environmental risk factors**.
94 +* **AI-assisted annotation** provides **causal interpretations** for clinical use.
95 +
96 +**Axis 2: Molecular Markers & Biomarkers**
97 +
98 +* **Integration of CSF, blood, and neuroimaging biomarkers**.
99 +* **Structured annotation** highlights **biological pathways linked to diagnosis**.
100 +
101 +**Axis 3: Neuroanatomoclinical Correlations**
102 +
103 +* **MRI and EEG data** provide anatomical and functional insights.
104 +* **AI-generated progression maps** annotate **brain structure-function relationships**.
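One way to picture the three axes is as a single record per subject with one list of annotations per axis. The sketch below is an assumption about how such a record could be structured, not the project's actual schema; the class names, field names, and the `to_report` method are hypothetical.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class AxisAnnotation:
    label: str              # short finding, e.g. a marker or region name
    evidence: str           # data source backing the finding
    ontology_id: str = ""   # optional ontology term, left empty if unknown


@dataclass
class TridimensionalDiagnosis:
    subject_id: str
    etiology: list = field(default_factory=list)              # Axis 1
    molecular_markers: list = field(default_factory=list)     # Axis 2
    neuroanatomoclinical: list = field(default_factory=list)  # Axis 3

    def to_report(self):
        """Serialize the three axes as a JSON structured report."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative entries only.
diagnosis = TridimensionalDiagnosis("S001")
diagnosis.etiology.append(AxisAnnotation("APOE e4 carrier", "genotyping"))
diagnosis.molecular_markers.append(AxisAnnotation("elevated CSF tau", "CSF assay"))
report = diagnosis.to_report()
```

Keeping each axis as its own list means an empty axis is still present in the report, so downstream tools can distinguish "no finding recorded" from a missing field.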
105 +
103 103  ----
104 104  
105 -== **AI-Driven Biomarker Categorization** ==
108 +=== **4. Computational Workflow & Annotation Pipelines** ===
106 106  
107 -Neurodiagnoses employs **AI models** for biomarker classification:
110 +==== **Data Processing Steps** ====
108 108  
109 -|=**Model Type**|=**Application**
110 -|**Graph Neural Networks (GNNs)**|Identify shared biomarker pathways across diseases
111 -|**Contrastive Learning**|Distinguish overlapping vs. unique biomarkers
112 -|**Multimodal Transformer Models**|Integrate imaging, omics, and clinical data
112 +**Data Ingestion:**
113 113  
114 +* **Harmonized datasets** stored in **EBRAINS Bucket**.
115 +* **Preprocessing pipelines** clean and standardize data.
116 +
117 +**Feature Engineering:**
118 +
119 +* **AI models** extract **clinically relevant patterns** from **EEG, MRI, and biomarkers**.
120 +
121 +**AI-Generated Annotations:**
122 +
123 +* **Automated tagging** of diagnostic features in **structured reports**.
124 +* **Explainability modules (SHAP, LIME)** ensure transparency in predictions.
125 +
126 +**Clinical Decision Support Integration:**
127 +
128 +* **AI-annotated findings** fed into **interactive dashboards**.
129 +* **Clinicians can review, validate, and correct annotations**.
130 +
114 114  ----
115 115  
116 -== **Collaboration & Partnerships** ==
133 +=== **5. Validation & Real-World Testing** ===
117 117  
118 -=== **Partnering with Data Providers** ===
135 +==== **Prospective Clinical Study** ====
119 119  
120 -Neurodiagnoses seeks partnerships with data repositories to:
137 +* **Multi-center validation** of AI-based **annotations & risk stratifications**.
138 +* **Benchmarking against clinician-based diagnoses**.
139 +* **Real-world testing** of AI-powered **structured reporting**.
121 121  
122 -* Enable **API-based data integration** for real-time processing.
123 -* Co-develop **harmonized AI-ready datasets** with standardized annotations.
124 -* Secure **funding opportunities** through joint grant applications.
141 +==== **Quality Assurance & Explainability** ====
125 125  
126 -**Interested in Partnering?**
143 +* **Annotations linked to structured knowledge graphs** for improved transparency.
144 +* **Interactive annotation editor** allows clinicians to validate AI outputs.
127 127  
128 -* If you represent a **research consortium or database provider**, reach out to explore **data-sharing agreements**.
129 -* **Contact**: [[info@neurodiagnoses.com>>mailto:info@neurodiagnoses.com]]
146 +----
130 130  
148 +=== **6. Collaborative Development** ===
149 +
150 +The project is **open to contributions** from **researchers, clinicians, and developers**.
151 +
152 +**Key tools include:**
153 +
154 +* **Jupyter Notebooks**: For data analysis and pipeline development.
155 +** Example: **probabilistic imputation**
156 +* **Wiki Pages**: For documenting methods and results.
157 +* **Drive and Bucket**: For sharing code, data, and outputs.
158 +* **Collaboration with related projects**:
159 +** Example: **Beyond the hype: AI in dementia – from early risk detection to disease treatment**
160 +
131 131  ----
132 132  
133 -== **Final Notes** ==
163 +=== **7. Tools and Technologies** ===
134 134  
135 -Neurodiagnoses continuously expands its **data ecosystem** to support **AI-driven clinical decision-making**. Researchers and institutions are encouraged to **contribute new datasets and methodologies**.
165 +==== **Programming Languages:** ====
136 136  
137 -**For additional technical documentation**:
167 +* **Python** for AI and data processing.
138 138  
139 -* **GitHub Repository**: [[Neurodiagnoses GitHub>>url:https://github.com/neurodiagnoses]]
140 -* **EBRAINS Collaboration Page**: [[EBRAINS Neurodiagnoses>>url:https://ebrains.eu/collabs/neurodiagnoses]]
169 +==== **Frameworks:** ====
141 141  
142 -**If you experience issues integrating data**, open a **GitHub Issue** or consult the **EBRAINS Neurodiagnoses Forum**.
171 +* **TensorFlow** and **PyTorch** for machine learning.
172 +* **Flask** or **FastAPI** for backend services.
143 143  
174 +==== **Visualization:** ====
175 +
176 +* **Plotly** and **Matplotlib** for interactive and static visualizations.
177 +
178 +==== **EBRAINS Services:** ====
179 +
180 +* **Collaboratory Lab** for running Notebooks.
181 +* **Buckets** for storing large datasets.
182 +
144 144  ----
145 145  
146 -This **updated methodology** now incorporates [[Neuromarker>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/biomarker_ontology]] for standardized biomarker classification, enabling **cross-disease AI training** across neurodegenerative disorders.
185 +=== **Why This Matters** ===
186 +
187 +* **The annotation system ensures that AI-generated insights are structured, interpretable, and clinically meaningful.**
188 +* **It enables real-time tracking of disease progression across the three diagnostic axes.**
189 +* **It facilitates integration with electronic health records and decision-support tools, improving AI adoption in clinical workflows.**