Wiki source code of Methodology
Version 15.1 by manuelmenendez on 2025/02/09 10:08
author | version | line-number | content |
---|---|---|---|
15.1 | 1 | == **Overview** == |
1.1 | 2 | |
6.1 | 3 | This project develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility**. The methodology integrates **multi-modal data**, including **genetic, neuroimaging, neurophysiological, and biomarker datasets**, and applies **machine learning models** to generate **structured, explainable diagnostic outputs**. |
1.1 | 4 | |
15.1 | 5 | == **Workflow** == |
7.1 | 6 | |
7 | 1. ((( | ||
12.1 | 8 | **We use GitHub to [[store and develop AI models, scripts, and annotation pipelines>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]]** |
7.1 | 9 | |
10 | * Create a **GitHub repository** for AI scripts and models. | ||
11 | * Use **GitHub Projects** to manage research milestones. | ||
12 | ))) | ||
13 | 1. ((( | ||
14 | **We use EBRAINS for data and collaboration** | ||
15 | |||
16 | * Store **biomarker and neuroimaging data** in **EBRAINS Buckets**. | ||
17 | * Run **Jupyter Notebooks** in **EBRAINS Lab** to test AI models. | ||
18 | * Use **EBRAINS Wiki** for structured documentation and research discussion. | ||
19 | ))) | ||
20 | |||
1.1 | 21 | ---- |
22 | |||
15.1 | 23 | == **1. Data Integration** == |
1.1 | 24 | |
14.1 | 25 | === **EBRAINS Medical Informatics Platform (MIP)** === |
12.2 | 26 | |
27 | Neurodiagnoses integrates clinical data via the **EBRAINS Medical Informatics Platform (MIP)**. MIP federates decentralized clinical data, allowing Neurodiagnoses to securely access and process sensitive information for AI-based diagnostics. | ||
28 | |||
14.1 | 29 | ==== How It Works ==== |
12.2 | 30 | |
31 | |||
32 | 1. ((( | ||
33 | **Authentication & API Access:** | ||
34 | |||
35 | * Users must have an **EBRAINS account**. | ||
36 | * Neurodiagnoses uses **secure API endpoints** to fetch clinical data (e.g., from the **Federation for Dementia**). | ||
37 | ))) | ||
38 | 1. ((( | ||
39 | **Data Mapping & Harmonization:** | ||
40 | |||
41 | * Retrieved data is **normalized** and converted to standard formats (.csv, .json). | ||
42 | * Data from **multiple sources** is harmonized to ensure consistency for AI processing. | ||
43 | ))) | ||
44 | 1. ((( | ||
45 | **Security & Compliance:** | ||
46 | |||
47 | * All data access is **logged and monitored**. | ||
48 | * When possible, data remains on **MIP servers** and is analyzed there using **federated learning techniques**. | ||
49 | * Access is granted only after signing a **Data Usage Agreement (DUA)**. | ||
50 | ))) | ||
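The mapping and harmonization step above can be illustrated with a minimal sketch. It assumes two sites export the same measurement under different column names; the column map, the field names, and the sample rows are hypothetical and are not taken from MIP or from mip_integration.py.

```python
import csv
import io
import json

# Hypothetical mapping from site-specific column names to a shared schema.
COLUMN_MAP = {
    "patient_id": "subject_id",
    "ID": "subject_id",
    "mmse_total": "mmse",
    "MMSE": "mmse",
}
# Only these harmonized fields are cast to numbers; identifiers stay strings.
NUMERIC_FIELDS = {"mmse"}


def harmonize(rows):
    """Rename columns to the shared schema and normalize numeric fields."""
    harmonized = []
    for row in rows:
        record = {}
        for key, value in row.items():
            name = COLUMN_MAP.get(key, key.lower())
            if name in NUMERIC_FIELDS:
                # Empty cells become None so missing values stay explicit.
                record[name] = float(value) if value not in ("", None) else None
            else:
                record[name] = value
        harmonized.append(record)
    return harmonized


# Illustrative exports from two sites (made-up values).
site_a = list(csv.DictReader(io.StringIO("patient_id,mmse_total\nA1,27\nA2,\n")))
site_b = list(csv.DictReader(io.StringIO("ID,MMSE\nB1,30\n")))

records = harmonize(site_a) + harmonize(site_b)
print(json.dumps(records, indent=2))
```

Keeping missing values as explicit nulls, rather than dropping the rows, leaves the decision about imputation to a later step.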
51 | |||
14.1 | 52 | ==== Implementation Steps ==== |
12.2 | 53 | |
54 | |||
55 | 1. Clone the repository. | ||
56 | 1. Configure your **EBRAINS API credentials** in mip_integration.py. | ||
57 | 1. Run the script to **download and harmonize clinical data**. | ||
58 | 1. Process the data for **AI model training**. | ||
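For the credentials step, one option is to read the token from the environment instead of writing it into the script. The sketch below assumes bearer-token authentication; the variable name `EBRAINS_TOKEN` and the function `build_auth_headers` are hypothetical and may not match what mip_integration.py actually expects.

```python
import os


def build_auth_headers(env=os.environ, var_name="EBRAINS_TOKEN"):
    """Return HTTP headers carrying a bearer token read from the environment.

    The variable name is an assumption made for this example.
    """
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set {var_name} before running the integration script")
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


# Usage with an explicit mapping, so the example runs without real credentials.
headers = build_auth_headers({"EBRAINS_TOKEN": "example-token"})
print(sorted(headers))
```

Keeping credentials out of the source file avoids committing them to the GitHub repository by accident.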
59 | |||
60 | For more detailed instructions, please refer to the **[[MIP Documentation>>url:https://mip.ebrains.eu/]]**. | ||
61 | |||
62 | ---- | ||
63 | |||
14.1 | 64 | === Data Processing & Integration with Clinica.Run === |
12.2 | 65 | |
66 | Neurodiagnoses now supports **Clinica.Run**, an open-source neuroimaging platform designed for **multimodal data processing and reproducible neuroscience workflows**. | ||
67 | |||
14.1 | 68 | ==== How It Works ==== |
12.2 | 69 | |
70 | |||
71 | 1. ((( | ||
72 | **Neuroimaging Preprocessing:** | ||
73 | |||
74 | * MRI, PET, and EEG data are preprocessed using **Clinica.Run pipelines**. | ||
75 | * Supports **longitudinal and cross-sectional analyses**. | ||
76 | ))) | ||
77 | 1. ((( | ||
78 | **Automated Biomarker Extraction:** | ||
79 | |||
80 | * Standardized extraction of **volumetric, metabolic, and functional biomarkers**. | ||
81 | * Integration with machine learning models in Neurodiagnoses. | ||
82 | ))) | ||
83 | 1. ((( | ||
84 | **Data Security & Compliance:** | ||
85 | |||
86 | * Clinica.Run operates in **compliance with GDPR and HIPAA**. | ||
87 | * Neuroimaging data remains **within the original storage environment**. | ||
88 | ))) | ||
89 | |||
14.1 | 90 | ==== Implementation Steps ==== |
12.2 | 91 | |
92 | |||
93 | 1. Install **Clinica.Run** dependencies. | ||
94 | 1. Configure your **Clinica.Run pipeline** in clinica_run_config.json. | ||
95 | 1. Run the pipeline for **preprocessing and biomarker extraction**. | ||
96 | 1. Use processed neuroimaging data for **AI-driven diagnostics** in Neurodiagnoses. | ||
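The configuration and run steps could look like the sketch below. The keys shown for clinica_run_config.json are hypothetical, and the command assumes the Clinica command-line form `clinica run <pipeline> <bids_directory> <caps_directory>`. The code only builds and prints the command; it does not execute it.

```python
import json
import shlex

# Hypothetical contents of clinica_run_config.json; the keys are illustrative.
CONFIG_TEXT = """
{
  "pipeline": "t1-linear",
  "bids_directory": "/data/bids",
  "caps_directory": "/data/caps"
}
"""

REQUIRED_KEYS = ("pipeline", "bids_directory", "caps_directory")


def build_command(config):
    """Translate the config into an argument list for the Clinica CLI."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing config keys: {missing}")
    return [
        "clinica",
        "run",
        config["pipeline"],
        config["bids_directory"],
        config["caps_directory"],
    ]


command = build_command(json.loads(CONFIG_TEXT))
print(shlex.join(command))  # pass `command` to subprocess.run to execute it
```

Validating the configuration before launching the pipeline surfaces missing paths early, before any long-running preprocessing starts.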
97 | |||
98 | For further information, refer to **[[Clinica.Run Documentation>>url:https://clinica.run/]]**. | ||
99 | |||
101 | |||
1.1 | 102 | ==== **Data Sources** ==== |
103 | |||
12.2 | 104 | [[List of potential databases>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]] |
105 | |||
6.1 | 106 | **Biomedical Ontologies & Databases:** |
1.1 | 107 | |
6.1 | 108 | * **Human Phenotype Ontology (HPO)** for symptom annotation. |
109 | * **Gene Ontology (GO)** for molecular and cellular processes. | ||
4.2 | 110 | |
6.1 | 111 | **Dimensionality Reduction and Interpretability:** |
1.1 | 112 | |
6.1 | 113 | * **Evaluate interpretability** using metrics like the **Area Under the Interpretability Curve (AUIC)**. |
11.1 | 114 | * **Leverage [[DEIBO>>https://github.com/Mellandd/DEIBO]] (Data-driven Embedding Interpretation Based on Ontologies)** to connect model dimensions to ontology concepts. |
1.1 | 115 | |
6.1 | 116 | **Neuroimaging & EEG/MEG Data:** |
117 | |||
118 | * **MRI volumetric measures** for brain atrophy tracking. | ||
119 | * **EEG functional connectivity patterns** (AI-Mind). | ||
120 | |||
121 | **Clinical & Biomarker Data:** | ||
122 | |||
123 | * **CSF biomarkers** (Amyloid-beta, Tau, Neurofilament Light). | ||
124 | * **Sleep monitoring and actigraphy data** (ADIS). | ||
125 | |||
126 | **Federated Learning Integration:** | ||
127 | |||
128 | * **Secure multi-center data harmonization** (PROMINENT). | ||
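One simple way to harmonize a measurement across centers without moving patient-level data is to share only summary statistics. The sketch below is an illustration of that idea under stated assumptions, not a description of how PROMINENT or MIP implement it: each site reports a count, a sum, and a sum of squares, and the pooled mean and standard deviation are then used for z-scoring at every site. All function names and values are made up.

```python
import math


def local_summary(values):
    """Computed inside each site; only these three numbers leave the site."""
    return {
        "n": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


def pooled_mean_std(summaries):
    """Combine per-site summaries into a pooled mean and population std."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    mean = total / n
    variance = max(total_sq / n - mean * mean, 0.0)  # guard against rounding
    return mean, math.sqrt(variance)


def z_scores(values, mean, std):
    """Applied locally at each site with the shared pooled parameters."""
    return [(v - mean) / std for v in values]


# Illustrative measurements held at two sites (made-up numbers).
site_a = [1.0, 2.0, 3.0]
site_b = [4.0, 5.0]

mean, std = pooled_mean_std([local_summary(site_a), local_summary(site_b)])
print(mean, std, z_scores(site_a, mean, std))
```

The pooled parameters equal those computed on the combined data, so every site applies the same scaling while raw values stay local.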
129 | |||
1.1 | 130 | ---- |
131 | |||
6.1 | 132 | ==== **Annotation System for Multi-Modal Data** ==== |
133 | |||
134 | To ensure **structured integration of diverse datasets**, **Neurodiagnoses** will implement an **AI-driven annotation system**, which will: | ||
135 | |||
136 | * **Assign standardized metadata tags** to diagnostic features. | ||
137 | * **Provide contextual explanations** for AI-based classifications. | ||
138 | * **Track temporal disease progression annotations** to identify long-term trends. | ||
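The three capabilities listed above suggest a record with tags, an explanation, and a date. The following is a minimal sketch of such a data structure; the class names, field names, and example values are hypothetical and do not describe the project's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Annotation:
    feature: str        # diagnostic feature being annotated
    tags: list          # standardized metadata tags
    explanation: str    # contextual explanation for the classification
    observed_on: date   # date of the underlying observation


@dataclass
class PatientTimeline:
    subject_id: str
    annotations: list = field(default_factory=list)

    def add(self, annotation):
        self.annotations.append(annotation)

    def history(self, feature):
        """Return annotations for one feature in chronological order."""
        matches = [a for a in self.annotations if a.feature == feature]
        return sorted(matches, key=lambda a: a.observed_on)


# Illustrative usage with made-up entries, added out of order on purpose.
timeline = PatientTimeline("subject-001")
timeline.add(Annotation("hippocampal_volume", ["MRI", "atrophy"],
                        "Volume below the reference range.", date(2024, 6, 1)))
timeline.add(Annotation("hippocampal_volume", ["MRI"],
                        "Volume within the reference range.", date(2023, 6, 1)))

for entry in timeline.history("hippocampal_volume"):
    print(entry.observed_on.isoformat(), entry.explanation)
```

Sorting by observation date at read time means annotations can be ingested in any order and still yield a consistent progression view.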
139 | |||
140 | ---- | ||
141 | |||
15.1 | 142 | == **2. AI-Based Analysis** == |
1.1 | 143 | |
6.1 | 144 | ==== **Machine Learning & Deep Learning Models** ==== |
1.1 | 145 | |
6.1 | 146 | **Risk Prediction Models:** |
1.1 | 147 | |
6.1 | 148 | * **LETHE’s cognitive risk prediction model** integrated into the annotation framework. |
1.1 | 149 | |
6.1 | 150 | **Biomarker Classification & Probabilistic Imputation:** |
1.1 | 151 | |
6.1 | 152 | * **KNN Imputer** and **Bayesian models** are used to handle **missing biomarker data**. |
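To show the idea behind k-nearest-neighbour imputation, here is a small pure-Python sketch. It is a simplified illustration rather than the project's implementation; in practice a library imputer such as scikit-learn's `KNNImputer` would normally be used, and features would be scaled first. The function name and the sample values are made up.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows."""

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        # Root mean squared difference over the columns both rows have.
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have a value in column j.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                completed[i][j] = sum(nearest) / len(nearest)
    return completed


# Made-up biomarker table: the last row is missing its second value.
rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [9.0, 90.0],
    [1.5, None],
]
imputed = knn_impute(rows, k=2)
print(imputed[3])
```

The missing entry is filled from the two rows closest in the observed column, so the distant third row does not influence the estimate.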
153 | |||
154 | **Neuroimaging Feature Extraction:** | ||
155 | |||
156 | * **MRI & EEG data** annotated with **neuroanatomical feature labels**. | ||
157 | |||
158 | ==== **AI-Powered Annotation System** ==== | ||
159 | |||
160 | * Uses **SHAP-based interpretability tools** to explain model decisions. | ||
161 | * Generates **automated clinical annotations** in structured reports. | ||
162 | * Links findings to **standardized medical ontologies** (e.g., **SNOMED, HPO**). | ||
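SHAP explanations are based on Shapley values, which can be computed exactly when a model has only a few features. The sketch below does that for a toy linear risk score; the model, its weights, and the feature values are invented for illustration and carry no clinical meaning. Features outside a coalition are replaced by a baseline value, which is one common convention and is assumed here.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(x)

    def value(coalition):
        masked = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def risk_score(features):
    """Toy linear model with made-up weights."""
    amyloid, tau, hippocampal_volume = features
    return 0.5 * amyloid + 0.3 * tau - 0.2 * hippocampal_volume


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(risk_score, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the property that makes them usable as per-feature explanations in a structured report. Exact computation grows exponentially with the number of features, which is why approximations are used for larger models.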
163 | |||
1.1 | 164 | ---- |
165 | |||
15.1 | 166 | == **3. Diagnostic Framework & Clinical Decision Support** == |
1.1 | 167 | |
6.1 | 168 | ==== **Tridimensional Diagnostic Axes** ==== |
1.1 | 169 | |
6.1 | 170 | **Axis 1: Etiology (Pathogenic Mechanisms)** |
1.1 | 171 | |
6.1 | 172 | * Classification based on **genetic markers, cellular pathways, and environmental risk factors**. |
173 | * **AI-assisted annotation** provides **causal interpretations** for clinical use. | ||
1.1 | 174 | |
6.1 | 175 | **Axis 2: Molecular Markers & Biomarkers** |
1.1 | 176 | |
6.1 | 177 | * **Integration of CSF, blood, and neuroimaging biomarkers**. |
178 | * **Structured annotation** highlights **biological pathways linked to diagnosis**. | ||
1.1 | 179 | |
6.1 | 180 | **Axis 3: Neuroanatomoclinical Correlations** |
181 | |||
182 | * **MRI and EEG data** provide anatomical and functional insights. | ||
183 | * **AI-generated progression maps** annotate **brain structure-function relationships**. | ||
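A three-axis diagnosis could be represented as a small structured record, as in the hedged sketch below. The class name, field names, and example findings are hypothetical and serve only to show how the axes might map onto a structured output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TridimensionalDiagnosis:
    etiology: tuple       # Axis 1: pathogenic mechanisms
    biomarkers: tuple     # Axis 2: molecular markers and biomarkers
    neuroanatomy: tuple   # Axis 3: neuroanatomoclinical correlations

    def summary(self):
        """Render one line per axis; empty axes are reported explicitly."""
        axes = [
            ("Axis 1 (Etiology)", self.etiology),
            ("Axis 2 (Molecular Markers & Biomarkers)", self.biomarkers),
            ("Axis 3 (Neuroanatomoclinical Correlations)", self.neuroanatomy),
        ]
        return "\n".join(
            f"{label}: {', '.join(items) or 'not assessed'}"
            for label, items in axes
        )


# Illustrative record with made-up findings.
diagnosis = TridimensionalDiagnosis(
    etiology=("APOE e4 carrier",),
    biomarkers=("reduced CSF amyloid-beta", "elevated CSF tau"),
    neuroanatomy=(),
)
print(diagnosis.summary())
```

Reporting an empty axis as "not assessed" keeps the output explicit about which dimensions lack data, rather than silently omitting them.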
184 | |||
1.1 | 185 | ---- |
186 | |||
15.1 | 187 | == **4. Computational Workflow & Annotation Pipelines** == |
1.1 | 188 | |
6.1 | 189 | ==== **Data Processing Steps** ==== |
1.1 | 190 | |
6.1 | 191 | **Data Ingestion:** |
192 | |||
193 | * **Harmonized datasets** stored in **EBRAINS Bucket**. | ||
194 | * **Preprocessing pipelines** clean and standardize data. | ||
195 | |||
196 | **Feature Engineering:** | ||
197 | |||
198 | * **AI models** extract **clinically relevant patterns** from **EEG, MRI, and biomarkers**. | ||
199 | |||
200 | **AI-Generated Annotations:** | ||
201 | |||
202 | * **Automated tagging** of diagnostic features in **structured reports**. | ||
203 | * **Explainability modules (SHAP, LIME)** ensure transparency in predictions. | ||
204 | |||
205 | **Clinical Decision Support Integration:** | ||
206 | |||
207 | * **AI-annotated findings** fed into **interactive dashboards**. | ||
208 | * **Clinicians can adjust, validate, and modify annotations**. | ||
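The clinician review step can be sketched as a small state change with an audit trail. This is one possible design under stated assumptions, not the project's dashboard logic; the class, the status names, and the example labels are hypothetical.

```python
from dataclasses import dataclass, field

DECISIONS = {"accepted", "modified", "rejected"}


@dataclass
class ReviewedAnnotation:
    ai_label: str                  # label proposed by the model, never overwritten
    final_label: str = ""          # label after clinician review
    status: str = "pending"        # pending, then accepted, modified, or rejected
    audit_log: list = field(default_factory=list)

    def review(self, clinician, decision, new_label=None):
        """Record a clinician decision and keep a trace of who decided what."""
        if decision not in DECISIONS:
            raise ValueError(f"unknown decision: {decision}")
        if decision == "modified" and not new_label:
            raise ValueError("a modified annotation needs a new label")
        if decision == "accepted":
            self.final_label = self.ai_label
        elif decision == "modified":
            self.final_label = new_label
        else:
            self.final_label = ""
        self.status = decision
        self.audit_log.append((clinician, decision, self.final_label))


# Illustrative usage with made-up labels and a made-up reviewer identifier.
annotation = ReviewedAnnotation(ai_label="medial temporal atrophy")
annotation.review("clinician-01", "modified", "mild medial temporal atrophy")
print(annotation.status, annotation.final_label, annotation.audit_log)
```

Keeping the original AI label alongside the final label makes it possible to measure later how often clinicians change model output.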
209 | |||
1.1 | 210 | ---- |
211 | |||
15.1 | 212 | == **5. Validation & Real-World Testing** == |
1.1 | 213 | |
6.1 | 214 | ==== **Prospective Clinical Study** ==== |
1.1 | 215 | |
6.1 | 216 | * **Multi-center validation** of AI-based **annotations & risk stratifications**. |
217 | * **Benchmarking against clinician-based diagnoses**. | ||
218 | * **Real-world testing** of AI-powered **structured reporting**. | ||
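Benchmarking against clinician-based diagnoses needs an agreement measure that corrects for chance. Cohen's kappa is one standard choice; whether the study will use it is an assumption of this sketch, and the labels below are made up.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both raters labelled independently at their own rates.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Illustrative labels for six cases (invented for the example).
ai_labels = ["AD", "AD", "FTD", "AD", "FTD", "FTD"]
clinician_labels = ["AD", "AD", "FTD", "FTD", "FTD", "AD"]

kappa = cohens_kappa(ai_labels, clinician_labels)
print(round(kappa, 3))
```

Here the raters agree on four of six cases, but half of that agreement is expected by chance, so kappa is much lower than the raw agreement rate.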
1.1 | 219 | |
6.1 | 220 | ==== **Quality Assurance & Explainability** ==== |
1.1 | 221 | |
6.1 | 222 | * **Annotations linked to structured knowledge graphs** for improved transparency. |
223 | * **Interactive annotation editor** allows clinicians to validate AI outputs. | ||
1.1 | 224 | |
225 | ---- | ||
226 | |||
15.1 | 227 | == **6. Collaborative Development** == |
1.1 | 228 | |
6.1 | 229 | The project is **open to contributions** from **researchers, clinicians, and developers**. |
1.1 | 230 | |
6.1 | 231 | **Key tools include:** |
232 | |||
1.1 | 233 | * **Jupyter Notebooks**: For data analysis and pipeline development. |
6.1 | 234 | ** Example: **probabilistic imputation** |
1.1 | 235 | * **Wiki Pages**: For documenting methods and results. |
236 | * **Drive and Bucket**: For sharing code, data, and outputs. | ||
6.1 | 237 | * **Collaboration with related projects**: |
238 | ** Example: **Beyond the hype: AI in dementia – from early risk detection to disease treatment** | ||
1.1 | 239 | |
240 | ---- | ||
241 | |||
15.1 | 242 | == **7. Tools and Technologies** == |
1.1 | 243 | |
6.1 | 244 | ==== **Programming Languages:** ==== |
245 | |||
246 | * **Python** for AI and data processing. | ||
247 | |||
248 | ==== **Frameworks:** ==== | ||
249 | |||
250 | * **TensorFlow** and **PyTorch** for machine learning. | ||
251 | * **Flask** or **FastAPI** for backend services. | ||
252 | |||
253 | ==== **Visualization:** ==== | ||
254 | |||
255 | * **Plotly** and **Matplotlib** for interactive and static visualizations, respectively. | ||
256 | |||
257 | ==== **EBRAINS Services:** ==== | ||
258 | |||
259 | * **Collaboratory Lab** for running Notebooks. | ||
260 | * **Buckets** for storing large datasets. | ||
261 | |||
262 | ---- | ||
263 | |||
264 | === **Why This Matters** === | ||
265 | |||
12.1 | 266 | * The annotation system ensures that AI-generated insights are structured, interpretable, and clinically meaningful. |
267 | * It enables real-time tracking of disease progression across the three diagnostic axes. | ||
268 | * It facilitates integration with electronic health records and decision-support tools, improving AI adoption in clinical workflows. |