Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 3.1
edited by manuelmenendez
on 2025/01/27 23:28
Change comment: There is no comment for this version
To version 7.1
edited by manuelmenendez
on 2025/02/01 14:11
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,7 +1,23 @@
1 -=== **Overview** ===
1 +==== **Overview** ====
2 2  
3 -This section describes the step-by-step process used in the **Neurodiagnoses** project to develop a novel diagnostic framework for neurological diseases. The methodology integrates artificial intelligence (AI), biomedical ontologies, and computational neuroscience to create a structured, interpretable, and scalable diagnostic system.
3 +This project develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility**. The methodology integrates **multi-modal data**, including **genetic, neuroimaging, neurophysiological, and biomarker datasets**, and applies **machine learning models** to generate **structured, explainable diagnostic outputs**.
4 4  
5 +=== **Workflow** ===
6 +
7 +1. (((
8 +**We Use GitHub for AI Development**
9 +
10 +* Create a **GitHub repository** for AI scripts and models.
11 +* Use **GitHub Projects** to manage research milestones.
12 +)))
13 +1. (((
14 +**We Use EBRAINS for Data & Collaboration**
15 +
16 +* Store **biomarker and neuroimaging data** in **EBRAINS Buckets**.
17 +* Run **Jupyter Notebooks** in **EBRAINS Lab** to test AI models.
18 +* Use **EBRAINS Wiki** for structured documentation and research discussion.
19 +)))
20 +
5 5  ----
6 6  
7 7  === **1. Data Integration** ===
... ... @@ -8,99 +8,166 @@
8 8  
9 9  ==== **Data Sources** ====
10 10  
11 -* **Biomedical Ontologies**:
12 -** Human Phenotype Ontology (HPO) for phenotypic abnormalities.
13 -** Gene Ontology (GO) for molecular and cellular processes.
14 -* **Neuroimaging Datasets**:
15 -** Example: Alzheimer’s Disease Neuroimaging Initiative (ADNI), OpenNeuro.
16 -* **Clinical and Biomarker Data**:
17 -** Anonymized clinical reports, molecular biomarkers, and test results.
27 +**Biomedical Ontologies & Databases:**
18 18  
19 -==== **Data Preprocessing** ====
29 +* **Human Phenotype Ontology (HPO)** for symptom annotation.
30 +* **Gene Ontology (GO)** for molecular and cellular processes.
20 20  
21 -1. **Standardization**: Ensure all data sources are normalized to a common format.
22 -1. **Feature Selection**: Identify relevant features for diagnosis (e.g., biomarkers, imaging scores).
23 -1. **Data Cleaning**: Handle missing values and remove duplicates.
32 +**Dimensionality Reduction and Interpretability:**
24 24  
34 +* **Evaluate interpretability** using metrics like the **Area Under the Interpretability Curve (AUIC)**.
35 +* **Leverage DEIBO (Data-driven Embedding Interpretation Based on Ontologies)** to connect model dimensions to ontology concepts.
36 +
37 +**Neuroimaging & EEG/MEG Data:**
38 +
39 +* **MRI volumetric measures** for brain atrophy tracking.
40 +* **EEG functional connectivity patterns** (AI-Mind).
41 +
42 +**Clinical & Biomarker Data:**
43 +
44 +* **CSF biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
45 +* **Sleep monitoring and actigraphy data** (ADIS).
46 +
47 +**Federated Learning Integration:**
48 +
49 +* **Secure multi-center data harmonization** (PROMINENT).
50 +
25 25  ----
26 26  
53 +==== **Annotation System for Multi-Modal Data** ====
54 +
55 +To ensure **structured integration of diverse datasets**, **Neurodiagnoses** will implement an **AI-driven annotation system**, which will:
56 +
57 +* **Assign standardized metadata tags** to diagnostic features.
58 +* **Provide contextual explanations** for AI-based classifications.
59 +* **Track temporal disease progression annotations** to identify long-term trends.
60 +
61 +----
62 +
27 27  === **2. AI-Based Analysis** ===
28 28  
29 -==== **Model Development** ====
65 +==== **Machine Learning & Deep Learning Models** ====
30 30  
31 -* **Embedding Models**: Use pre-trained models like BioBERT or BioLORD for text data.
32 -* **Classification Models**:
33 -** Algorithms: Random Forest, Support Vector Machines (SVM), or neural networks.
34 -** Purpose: Predict the likelihood of specific neurological conditions based on input data.
67 +**Risk Prediction Models:**
35 35  
36 -==== **Dimensionality Reduction and Interpretability** ====
69 +* **LETHE’s cognitive risk prediction model** integrated into the annotation framework.
37 37  
38 -* Leverage [[DEIBO>>https://drive.ebrains.eu/f/8d7157708cde4b258db0/]] (Data-driven Embedding Interpretation Based on Ontologies) to connect model dimensions to ontology concepts.
39 -* Evaluate interpretability using metrics like the Area Under the Interpretability Curve (AUIC).
71 +**Biomarker Classification & Probabilistic Imputation:**
40 40  
73 +* **KNN Imputer** and **Bayesian models** are used to handle **missing biomarker data**.
74 +
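The KNN-based handling of missing biomarker data mentioned above can be illustrated with a minimal sketch. It is not the project's implementation: the function `knn_impute`, the distance scaling, and the toy values are assumptions made for illustration, and it uses only the standard library so it runs anywhere. In practice a library class such as scikit-learn's `KNNImputer` would likely be used instead.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is Euclidean over the columns both rows have observed,
    rescaled by the share of columns that could be compared.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * len(a) / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows in which column j was actually measured.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy, made-up values: one row per subject, one column per biomarker.
rows = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],   # third biomarker missing
    [10.0, 20.0, 30.0],
    [1.2, 2.1, 3.4],
]
filled = knn_impute(rows, k=2)
print(filled[1])
```

The missing value is filled from the two most similar subjects (rows 0 and 3), so the outlying third row does not influence it. A Bayesian model would instead return a distribution over the missing value rather than a single estimate.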
75 +**Neuroimaging Feature Extraction:**
76 +
77 +* **MRI & EEG data** annotated with **neuroanatomical feature labels**.
78 +
79 +==== **AI-Powered Annotation System** ====
80 +
81 +* Uses **SHAP-based interpretability tools** to explain model decisions.
82 +* Generates **automated clinical annotations** in structured reports.
83 +* Links findings to **standardized medical ontologies** (e.g., **SNOMED, HPO**).
84 +
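SHAP attributions are based on Shapley values, so the idea behind the interpretability step can be sketched without the SHAP library itself. The example below computes exact Shapley values by brute force for a hypothetical three-feature linear risk score. The function `shapley_values`, the weights, and the input values are illustrative assumptions, not part of the project.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of predict(instance) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2**n, so this only suits a handful of features.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear risk score over three made-up features.
WEIGHTS = (0.5, 2.0, -1.0)


def risk_score(x):
    return sum(w * v for w, v in zip(WEIGHTS, x))


instance = [4.0, 1.0, 3.0]
baseline = [2.0, 0.0, 1.0]
phi = shapley_values(risk_score, instance, baseline)
print(phi)
```

For a linear model each attribution reduces to the weight times the feature's deviation from the baseline, and the attributions sum to the difference between the prediction and the baseline prediction. The SHAP library approximates the same quantity efficiently for larger models.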
41 41  ----
42 42  
43 -=== **3. Diagnostic Framework** ===
87 +=== **3. Diagnostic Framework & Clinical Decision Support** ===
44 44  
45 -==== **Axes of Diagnosis** ====
89 +==== **Tridimensional Diagnostic Axes** ====
46 46  
47 -The framework organizes diagnostic data into three axes:
91 +**Axis 1: Etiology (Pathogenic Mechanisms)**
48 48  
49 -1. **Etiology**: Genetic and environmental risk factors.
50 -1. **Molecular Markers**: Biomarkers such as amyloid-beta, tau, and alpha-synuclein.
51 -1. **Neuroanatomical Correlations**: Results from neuroimaging (e.g., MRI, PET).
93 +* Classification based on **genetic markers, cellular pathways, and environmental risk factors**.
94 +* **AI-assisted annotation** provides **causal interpretations** for clinical use.
52 52  
53 -==== **Recommendation System** ====
96 +**Axis 2: Molecular Markers & Biomarkers**
54 54  
55 -* Suggests additional tests or biomarkers if gaps are detected in the data.
56 -* Prioritizes tests based on clinical impact and cost-effectiveness.
98 +* **Integration of CSF, blood, and neuroimaging biomarkers**.
99 +* **Structured annotation** highlights **biological pathways linked to diagnosis**.
57 57  
101 +**Axis 3: Neuroanatomoclinical Correlations**
102 +
103 +* **MRI and EEG data** provide anatomical and functional insights.
104 +* **AI-generated progression maps** annotate **brain structure-function relationships**.
105 +
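One way to picture how annotations could be organized along the three axes is the data-structure sketch below. All class names, field names, and values are hypothetical. The ontology codes are placeholders rather than real HPO or SNOMED terms, and the dates are made up.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Axis(Enum):
    ETIOLOGY = 1
    MOLECULAR_MARKERS = 2
    NEUROANATOMOCLINICAL = 3


@dataclass(frozen=True)
class Annotation:
    axis: Axis
    feature: str
    ontology_code: str   # placeholder codes below, not real ontology terms
    explanation: str
    observed_on: date


@dataclass
class DiagnosticRecord:
    patient_id: str
    annotations: list = field(default_factory=list)

    def add(self, annotation):
        self.annotations.append(annotation)

    def by_axis(self, axis):
        """Annotations for one axis, oldest first, to follow progression."""
        return sorted((a for a in self.annotations if a.axis is axis),
                      key=lambda a: a.observed_on)


record = DiagnosticRecord(patient_id="anon-001")
record.add(Annotation(Axis.NEUROANATOMOCLINICAL, "hippocampal volume (follow-up)",
                      "PLACEHOLDER:0002", "MRI volumetric measure",
                      date(2024, 6, 1)))
record.add(Annotation(Axis.MOLECULAR_MARKERS, "CSF tau",
                      "PLACEHOLDER:0003", "CSF biomarker",
                      date(2024, 1, 15)))
record.add(Annotation(Axis.NEUROANATOMOCLINICAL, "hippocampal volume (baseline)",
                      "PLACEHOLDER:0001", "MRI volumetric measure",
                      date(2024, 1, 10)))

for a in record.by_axis(Axis.NEUROANATOMOCLINICAL):
    print(a.observed_on, a.feature)
```

Keeping the axis, the ontology code, and the observation date on every annotation allows the same record to be filtered per axis and read chronologically, which is what temporal progression tracking requires.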
58 58  ----
59 59  
60 -=== **4. Computational Workflow** ===
108 +=== **4. Computational Workflow & Annotation Pipelines** ===
61 61  
62 -1. **Data Loading**: Import data from storage (Drive or Bucket).
63 -1. **Feature Engineering**: Generate derived features from the raw data.
64 -1. **Model Training**:
65 -1*. Split data into training, validation, and test sets.
66 -1*. Train models with cross-validation to ensure robustness.
67 -1. **Evaluation**:
68 -1*. Metrics: Accuracy, F1-Score, AUIC for interpretability.
69 -1*. Compare against baseline models and domain benchmarks.
110 +==== **Data Processing Steps** ====
70 70  
112 +**Data Ingestion:**
113 +
114 +* **Harmonized datasets** stored in **EBRAINS Bucket**.
115 +* **Preprocessing pipelines** clean and standardize data.
116 +
117 +**Feature Engineering:**
118 +
119 +* **AI models** extract **clinically relevant patterns** from **EEG, MRI, and biomarkers**.
120 +
121 +**AI-Generated Annotations:**
122 +
123 +* **Automated tagging** of diagnostic features in **structured reports**.
124 +* **Explainability modules (SHAP, LIME)** ensure transparency in predictions.
125 +
126 +**Clinical Decision Support Integration:**
127 +
128 +* **AI-annotated findings** fed into **interactive dashboards**.
129 +* **Clinicians can adjust, validate, and modify annotations**.
130 +
71 71  ----
72 72  
73 -=== **5. Validation** ===
133 +=== **5. Validation & Real-World Testing** ===
74 74  
75 -==== **Internal Validation** ====
135 +==== **Prospective Clinical Study** ====
76 76  
77 -* Test the system using simulated datasets and known clinical cases.
78 -* Fine-tune models based on validation results.
137 +* **Multi-center validation** of AI-based **annotations & risk stratifications**.
138 +* **Benchmarking against clinician-based diagnoses**.
139 +* **Real-world testing** of AI-powered **structured reporting**.
79 79  
80 -==== **External Validation** ====
141 +==== **Quality Assurance & Explainability** ====
81 81  
82 -* Collaborate with research institutions and hospitals to test the system in real-world settings.
83 -* Use anonymized patient data to ensure privacy compliance.
143 +* **Annotations linked to structured knowledge graphs** for improved transparency.
144 +* **Interactive annotation editor** allows clinicians to validate AI outputs.
84 84  
85 85  ----
86 86  
87 87  === **6. Collaborative Development** ===
88 88  
89 -The project is open to contributions from researchers, clinicians, and developers. Key tools include:
150 +The project is **open to contributions** from **researchers, clinicians, and developers**.
90 90  
152 +**Key tools include:**
153 +
91 91  * **Jupyter Notebooks**: For data analysis and pipeline development.
155 +** Example: **probabilistic imputation**
92 92  * **Wiki Pages**: For documenting methods and results.
93 93  * **Drive and Bucket**: For sharing code, data, and outputs.
158 +* **Collaboration with related projects**:
159 +** Example: **Beyond the hype: AI in dementia – from early risk detection to disease treatment**
94 94  
95 95  ----
96 96  
97 97  === **7. Tools and Technologies** ===
98 98  
99 -* **Programming Languages**: Python for AI and data processing.
100 -* **Frameworks**:
101 -** TensorFlow and PyTorch for machine learning.
102 -** Flask or FastAPI for backend services.
103 -* **Visualization**: Plotly and Matplotlib for interactive and static visualizations.
104 -* **EBRAINS Services**:
105 -** Collaboratory Lab for running Notebooks.
106 -** Buckets for storing large datasets.
165 +==== **Programming Languages:** ====
166 +
167 +* **Python** for AI and data processing.
168 +
169 +==== **Frameworks:** ====
170 +
171 +* **TensorFlow** and **PyTorch** for machine learning.
172 +* **Flask** or **FastAPI** for backend services.
173 +
174 +==== **Visualization:** ====
175 +
176 +* **Plotly** and **Matplotlib** for interactive and static visualizations.
177 +
178 +==== **EBRAINS Services:** ====
179 +
180 +* **Collaboratory Lab** for running Notebooks.
181 +* **Buckets** for storing large datasets.
182 +
183 +----
184 +
185 +=== **Why This Matters** ===
186 +
187 +* **The annotation system ensures that AI-generated insights are structured, interpretable, and clinically meaningful.**
188 +* **It enables real-time tracking of disease progression across the three diagnostic axes.**
189 +* **It facilitates integration with electronic health records and decision-support tools, improving AI adoption in clinical workflows.**