Wiki source code of Methodology
Version 13.1 by manuelmenendez on 2025/02/09 09:56
| author | version | line-number | content |
|---|---|---|---|
| 1 | ==== **Overview** ==== | ||
| 2 | |||
| 3 | This project develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility**. The methodology integrates **multi-modal data**, including **genetic, neuroimaging, neurophysiological, and biomarker datasets**, and applies **machine learning models** to generate **structured, explainable diagnostic outputs**. | ||
| 4 | |||
| 5 | === **Workflow** === | ||
| 6 | |||
| 7 | 1. ((( | ||
| 8 | **We use GitHub to [[store and develop AI models, scripts, and annotation pipelines>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]].** | ||
| 9 | |||
| 10 | * Create a **GitHub repository** for AI scripts and models. | ||
| 11 | * Use **GitHub Projects** to manage research milestones. | ||
| 12 | ))) | ||
| 13 | 1. ((( | ||
| 14 | **We use EBRAINS for data and collaboration.** | ||
| 15 | |||
| 16 | * Store **biomarker and neuroimaging data** in **EBRAINS Buckets**. | ||
| 17 | * Run **Jupyter Notebooks** in **EBRAINS Lab** to test AI models. | ||
| 18 | * Use **EBRAINS Wiki** for structured documentation and research discussion. | ||
| 19 | ))) | ||
| 20 | |||
| 21 | ---- | ||
| 22 | |||
| 23 | === **1. Data Integration** === | ||
| 24 | |||
| 25 | == Overview == | ||
| 26 | |||
| 27 | |||
| 28 | Neurodiagnoses integrates clinical data via the **EBRAINS Medical Informatics Platform (MIP)**. MIP federates decentralized clinical data, allowing Neurodiagnoses to securely access and process sensitive information for AI-based diagnostics. | ||
| 29 | |||
| 30 | == How It Works == | ||
| 31 | |||
| 32 | |||
| 33 | 1. ((( | ||
| 34 | **Authentication & API Access:** | ||
| 35 | |||
| 36 | * Users must have an **EBRAINS account**. | ||
| 37 | * Neurodiagnoses uses **secure API endpoints** to fetch clinical data (e.g., from the **Federation for Dementia**). | ||
| 38 | ))) | ||
| 39 | 1. ((( | ||
| 40 | **Data Mapping & Harmonization:** | ||
| 41 | |||
| 42 | * Retrieved data is **normalized** and converted to standard formats (.csv, .json). | ||
| 43 | * Data from **multiple sources** is harmonized to ensure consistency for AI processing. | ||
| 44 | ))) | ||
| 45 | 1. ((( | ||
| 46 | **Security & Compliance:** | ||
| 47 | |||
| 48 | * All data access is **logged and monitored**. | ||
| 49 | * Where possible, data remains on **MIP servers** and is analyzed using **federated learning techniques**. | ||
| 50 | * Access is granted only after signing a **Data Usage Agreement (DUA)**. | ||
| 51 | ))) | ||
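The mapping and harmonization step above can be sketched as follows. This is a minimal illustration using only the Python standard library. The site names, the column names, and the COLUMN_MAP table are hypothetical and are not taken from the MIP or from mip_integration.py.

```python
import csv
import io
import json

# Hypothetical mapping from site-specific column names to a shared schema.
COLUMN_MAP = {
    "site_a": {"pid": "patient_id", "mmse_total": "mmse", "age_years": "age"},
    "site_b": {"PatientID": "patient_id", "MMSE": "mmse", "Age": "age"},
}


def harmonize(records, site):
    """Rename site-specific columns and cast numeric fields to float."""
    mapping = COLUMN_MAP[site]
    harmonized = []
    for record in records:
        row = {"site": site}
        for source_name, target_name in mapping.items():
            value = record.get(source_name, "")
            if target_name == "patient_id":
                row[target_name] = str(value)
            else:
                # Empty strings become None so missing values stay explicit.
                row[target_name] = float(value) if value not in ("", None) else None
        harmonized.append(row)
    return harmonized


def to_csv(rows):
    """Serialize harmonized rows to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["site", "patient_id", "mmse", "age"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Two toy exports with different column names (made-up values).
site_a = [{"pid": "A1", "mmse_total": "27", "age_years": "71"}]
site_b = [{"PatientID": "B7", "MMSE": "", "Age": "68"}]

combined = harmonize(site_a, "site_a") + harmonize(site_b, "site_b")
print(json.dumps(combined, indent=2))  # JSON view of the harmonized rows
print(to_csv(combined))                # CSV view of the same rows
```

Keeping missing values as None, rather than dropping the row, leaves them available for a later imputation step.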
| 52 | |||
| 53 | == Implementation Steps == | ||
| 54 | |||
| 55 | |||
| 56 | 1. Clone the repository. | ||
| 57 | 1. Configure your **EBRAINS API credentials** in mip_integration.py. | ||
| 58 | 1. Run the script to **download and harmonize clinical data**. | ||
| 59 | 1. Process the data for **AI model training**. | ||
| 60 | |||
| 61 | For more detailed instructions, please refer to the **[[MIP Documentation>>url:https://mip.ebrains.eu/]]**. | ||
| 62 | |||
| 63 | ---- | ||
| 64 | |||
| 65 | = Data Processing & Integration with Clinica.Run = | ||
| 66 | |||
| 67 | |||
| 68 | == Overview == | ||
| 69 | |||
| 70 | |||
| 71 | Neurodiagnoses now supports **Clinica.Run**, an open-source neuroimaging platform designed for **multimodal data processing and reproducible neuroscience workflows**. | ||
| 72 | |||
| 73 | == How It Works == | ||
| 74 | |||
| 75 | |||
| 76 | 1. ((( | ||
| 77 | **Neuroimaging Preprocessing:** | ||
| 78 | |||
| 79 | * MRI, PET, and EEG data are preprocessed using **Clinica.Run pipelines**. | ||
| 80 | * The pipelines support **longitudinal and cross-sectional analyses**. | ||
| 81 | ))) | ||
| 82 | 1. ((( | ||
| 83 | **Automated Biomarker Extraction:** | ||
| 84 | |||
| 85 | * Standardized extraction of **volumetric, metabolic, and functional biomarkers**. | ||
| 86 | * Integration with machine learning models in Neurodiagnoses. | ||
| 87 | ))) | ||
| 88 | 1. ((( | ||
| 89 | **Data Security & Compliance:** | ||
| 90 | |||
| 91 | * Clinica.Run operates in **compliance with GDPR and HIPAA**. | ||
| 92 | * Neuroimaging data remains **within the original storage environment**. | ||
| 93 | ))) | ||
| 94 | |||
| 95 | == Implementation Steps == | ||
| 96 | |||
| 97 | |||
| 98 | 1. Install **Clinica.Run** dependencies. | ||
| 99 | 1. Configure your **Clinica.Run pipeline** in clinica_run_config.json. | ||
| 100 | 1. Run the pipeline for **preprocessing and biomarker extraction**. | ||
| 101 | 1. Use processed neuroimaging data for **AI-driven diagnostics** in Neurodiagnoses. | ||
| 102 | |||
| 103 | For further information, refer to **[[Clinica.Run Documentation>>url:https://clinica.run/]]**. | ||
| 104 | |||
| 107 | ==== **Data Sources** ==== | ||
| 108 | |||
| 109 | [[List of potential sources of databases>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]] | ||
| 110 | |||
| 111 | **Biomedical Ontologies & Databases:** | ||
| 112 | |||
| 113 | * **Human Phenotype Ontology (HPO)** for symptom annotation. | ||
| 114 | * **Gene Ontology (GO)** for molecular and cellular processes. | ||
| 115 | |||
| 116 | **Dimensionality Reduction and Interpretability:** | ||
| 117 | |||
| 118 | * **Evaluate interpretability** using metrics such as the **Area Under the Interpretability Curve (AUIC)**. | ||
| 119 | * **Leverage [[DEIBO>>https://github.com/Mellandd/DEIBO]] (Data-driven Embedding Interpretation Based on Ontologies)** to connect model dimensions to ontology concepts. | ||
| 120 | |||
| 121 | **Neuroimaging & EEG/MEG Data:** | ||
| 122 | |||
| 123 | * **MRI volumetric measures** for brain atrophy tracking. | ||
| 124 | * **EEG functional connectivity patterns** (AI-Mind). | ||
| 125 | |||
| 126 | **Clinical & Biomarker Data:** | ||
| 127 | |||
| 128 | * **CSF biomarkers** (Amyloid-beta, Tau, Neurofilament Light). | ||
| 129 | * **Sleep monitoring and actigraphy data** (ADIS). | ||
| 130 | |||
| 131 | **Federated Learning Integration:** | ||
| 132 | |||
| 133 | * **Secure multi-center data harmonization** (PROMINENT). | ||
| 134 | |||
| 135 | ---- | ||
| 136 | |||
| 137 | ==== **Annotation System for Multi-Modal Data** ==== | ||
| 138 | |||
| 139 | To ensure **structured integration of diverse datasets**, **Neurodiagnoses** will implement an **AI-driven annotation system**, which will: | ||
| 140 | |||
| 141 | * **Assign standardized metadata tags** to diagnostic features. | ||
| 142 | * **Provide contextual explanations** for AI-based classifications. | ||
| 143 | * **Track disease progression annotations over time** to identify long-term trends. | ||
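One possible shape for such an annotation record is sketched below. The Annotation class, its field names, the tag strings, and the example values are all hypothetical; the sketch only illustrates how metadata tags, an explanation, and a date could be kept together so that progression can be read off later.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Annotation:
    """One annotation attached to a diagnostic feature (hypothetical schema)."""
    feature: str                               # e.g. a biomarker or imaging measure
    value: float
    recorded_on: date
    tags: list = field(default_factory=list)   # standardized metadata tags
    explanation: str = ""                      # context for the classification


def progression(annotations, feature):
    """Return (date, value) pairs for one feature, ordered by date."""
    points = [(a.recorded_on, a.value) for a in annotations if a.feature == feature]
    return sorted(points)


def net_change(annotations, feature):
    """Difference between the latest and the earliest value of a feature."""
    points = progression(annotations, feature)
    if len(points) < 2:
        return None  # a trend needs at least two time points
    return points[-1][1] - points[0][1]


# Made-up measurements, deliberately out of chronological order.
history = [
    Annotation("hippocampal_volume_mm3", 3250.0, date(2024, 1, 15),
               ["modality:MRI", "axis:3"], "Volume below the previous visit."),
    Annotation("hippocampal_volume_mm3", 3400.0, date(2023, 1, 10),
               ["modality:MRI", "axis:3"]),
    Annotation("csf_tau_pg_ml", 410.0, date(2024, 1, 15),
               ["modality:CSF", "axis:2"]),
]

print(progression(history, "hippocampal_volume_mm3"))
print(net_change(history, "hippocampal_volume_mm3"))
```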
| 144 | |||
| 145 | ---- | ||
| 146 | |||
| 147 | === **2. AI-Based Analysis** === | ||
| 148 | |||
| 149 | ==== **Machine Learning & Deep Learning Models** ==== | ||
| 150 | |||
| 151 | **Risk Prediction Models:** | ||
| 152 | |||
| 153 | * **LETHE’s cognitive risk prediction model** integrated into the annotation framework. | ||
| 154 | |||
| 155 | **Biomarker Classification & Probabilistic Imputation:** | ||
| 156 | |||
| 157 | * **KNN Imputer** and **Bayesian models** are used to handle **missing biomarker data**. | ||
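To make the idea of k-nearest-neighbour imputation concrete, here is a minimal sketch in plain Python. It is not the project's implementation and not scikit-learn's KNNImputer, which would normally be used in practice. The biomarker values are made up, and the distance (root mean squared difference over the features both rows have observed) is one simple choice among several.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows that have a value.

    Distance is the root mean squared difference over the features that are
    observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing in common, so the rows cannot be compared
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = [
                (distance(row, other), other[j])
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            ]
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for d, v in donors[:k] if d != math.inf]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Columns: three already-scaled biomarkers (made-up values); None marks a gap.
samples = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 9.0],
    [1.2, 1.9, 3.4],
]

filled = knn_impute(samples, k=2)
print(filled[1])
```

Because the distance mixes features, inputs should be on comparable scales before imputation; otherwise the feature with the largest numeric range dominates the neighbour search.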
| 158 | |||
| 159 | **Neuroimaging Feature Extraction:** | ||
| 160 | |||
| 161 | * **MRI & EEG data** annotated with **neuroanatomical feature labels**. | ||
| 162 | |||
| 163 | ==== **AI-Powered Annotation System** ==== | ||
| 164 | |||
| 165 | * Uses **SHAP-based interpretability tools** to explain model decisions. | ||
| 166 | * Generates **automated clinical annotations** in structured reports. | ||
| 167 | * Links findings to **standardized medical ontologies** (e.g., **SNOMED CT, HPO**). | ||
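SHAP attributions are built on Shapley values. The sketch below computes exact Shapley values for a toy linear risk score by averaging each feature's marginal contribution over all feature orderings. It is meant to show what an attribution means, not to reproduce the shap library, and it only scales to a handful of features. The feature names, weights, and baseline are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    Features that have not yet been "revealed" are held at their baseline value.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in ordering:
            current[name] = instance[name]
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical linear risk score over three scaled biomarkers.
WEIGHTS = {"tau": 0.5, "amyloid_beta": -0.25, "nfl": 0.125}


def risk_score(features):
    return sum(WEIGHTS[name] * value for name, value in features.items())


baseline = {"tau": 0.0, "amyloid_beta": 0.0, "nfl": 0.0}
patient = {"tau": 2.0, "amyloid_beta": 4.0, "nfl": 8.0}

attributions = shapley_values(risk_score, patient, baseline)
for name, contribution in attributions.items():
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the property that lets a structured report state how much each feature moved the prediction.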
| 168 | |||
| 169 | ---- | ||
| 170 | |||
| 171 | === **3. Diagnostic Framework & Clinical Decision Support** === | ||
| 172 | |||
| 173 | ==== **Tridimensional Diagnostic Axes** ==== | ||
| 174 | |||
| 175 | **Axis 1: Etiology (Pathogenic Mechanisms)** | ||
| 176 | |||
| 177 | * Classification based on **genetic markers, cellular pathways, and environmental risk factors**. | ||
| 178 | * **AI-assisted annotation** provides **causal interpretations** for clinical use. | ||
| 179 | |||
| 180 | **Axis 2: Molecular Markers & Biomarkers** | ||
| 181 | |||
| 182 | * **Integration of CSF, blood, and neuroimaging biomarkers**. | ||
| 183 | * **Structured annotation** highlights **biological pathways linked to diagnosis**. | ||
| 184 | |||
| 185 | **Axis 3: Neuroanatomoclinical Correlations** | ||
| 186 | |||
| 187 | * **MRI and EEG data** provide anatomical and functional insights. | ||
| 188 | * **AI-generated progression maps** annotate **brain structure-function relationships**. | ||
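A structured output covering the three axes could be represented as in the sketch below. The class names, field names, and example findings are hypothetical and purely illustrative; they do not describe the project's actual report schema or any real patient.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AxisFinding:
    """A single annotated finding on one diagnostic axis (hypothetical schema)."""
    label: str
    evidence: list = field(default_factory=list)
    probability: float = 0.0  # model-estimated probability, illustrative only


@dataclass
class TridimensionalDiagnosis:
    """Groups findings by axis so that a report keeps the three views separate."""
    patient_id: str
    etiology: list = field(default_factory=list)              # Axis 1
    biomarkers: list = field(default_factory=list)            # Axis 2
    neuroanatomoclinical: list = field(default_factory=list)  # Axis 3

    def to_json(self):
        # asdict converts nested dataclasses, including those inside lists.
        return json.dumps(asdict(self), indent=2)


report = TridimensionalDiagnosis(
    patient_id="demo-001",
    etiology=[AxisFinding("genetic risk variant present", ["genotyping"], 0.8)],
    biomarkers=[AxisFinding("elevated CSF tau", ["CSF assay"], 0.7)],
    neuroanatomoclinical=[
        AxisFinding("hippocampal atrophy with memory impairment",
                    ["MRI volumetry", "cognitive testing"], 0.75)
    ],
)

print(report.to_json())
```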
| 189 | |||
| 190 | ---- | ||
| 191 | |||
| 192 | === **4. Computational Workflow & Annotation Pipelines** === | ||
| 193 | |||
| 194 | ==== **Data Processing Steps** ==== | ||
| 195 | |||
| 196 | **Data Ingestion:** | ||
| 197 | |||
| 198 | * **Harmonized datasets** are stored in **EBRAINS Buckets**. | ||
| 199 | * **Preprocessing pipelines** clean and standardize data. | ||
| 200 | |||
| 201 | **Feature Engineering:** | ||
| 202 | |||
| 203 | * **AI models** extract **clinically relevant patterns** from **EEG, MRI, and biomarkers**. | ||
| 204 | |||
| 205 | **AI-Generated Annotations:** | ||
| 206 | |||
| 207 | * **Automated tagging** of diagnostic features in **structured reports**. | ||
| 208 | * **Explainability modules (SHAP, LIME)** ensure transparency in predictions. | ||
| 209 | |||
| 210 | **Clinical Decision Support Integration:** | ||
| 211 | |||
| 212 | * **AI-annotated findings** fed into **interactive dashboards**. | ||
| 213 | * **Clinicians can review, validate, and correct annotations**. | ||
| 214 | |||
| 215 | ---- | ||
| 216 | |||
| 217 | === **5. Validation & Real-World Testing** === | ||
| 218 | |||
| 219 | ==== **Prospective Clinical Study** ==== | ||
| 220 | |||
| 221 | * **Multi-center validation** of AI-based **annotations & risk stratifications**. | ||
| 222 | * **Benchmarking against clinician-based diagnoses**. | ||
| 223 | * **Real-world testing** of AI-powered **structured reporting**. | ||
| 224 | |||
| 225 | ==== **Quality Assurance & Explainability** ==== | ||
| 226 | |||
| 227 | * **Annotations linked to structured knowledge graphs** for improved transparency. | ||
| 228 | * **Interactive annotation editor** allows clinicians to validate AI outputs. | ||
| 229 | |||
| 230 | ---- | ||
| 231 | |||
| 232 | === **6. Collaborative Development** === | ||
| 233 | |||
| 234 | The project is **open to contributions** from **researchers, clinicians, and developers**. | ||
| 235 | |||
| 236 | **Key tools include:** | ||
| 237 | |||
| 238 | * **Jupyter Notebooks**: For data analysis and pipeline development. | ||
| 239 | ** Example: **probabilistic imputation** | ||
| 240 | * **Wiki Pages**: For documenting methods and results. | ||
| 241 | * **Drive and Bucket**: For sharing code, data, and outputs. | ||
| 242 | * **Collaboration with related projects**: | ||
| 243 | ** Example: **Beyond the hype: AI in dementia – from early risk detection to disease treatment** | ||
| 244 | |||
| 245 | ---- | ||
| 246 | |||
| 247 | === **7. Tools and Technologies** === | ||
| 248 | |||
| 249 | ==== **Programming Languages:** ==== | ||
| 250 | |||
| 251 | * **Python** for AI and data processing. | ||
| 252 | |||
| 253 | ==== **Frameworks:** ==== | ||
| 254 | |||
| 255 | * **TensorFlow** and **PyTorch** for machine learning. | ||
| 256 | * **Flask** or **FastAPI** for backend services. | ||
| 257 | |||
| 258 | ==== **Visualization:** ==== | ||
| 259 | |||
| 260 | * **Plotly** and **Matplotlib** for interactive and static visualizations. | ||
| 261 | |||
| 262 | ==== **EBRAINS Services:** ==== | ||
| 263 | |||
| 264 | * **Collaboratory Lab** for running Notebooks. | ||
| 265 | * **Buckets** for storing large datasets. | ||
| 266 | |||
| 267 | ---- | ||
| 268 | |||
| 269 | === **Why This Matters** === | ||
| 270 | |||
| 271 | * The annotation system ensures that AI-generated insights are structured, interpretable, and clinically meaningful. | ||
| 272 | * It enables real-time tracking of disease progression across the three diagnostic axes. | ||
| 273 | * It facilitates integration with electronic health records and decision-support tools, improving AI adoption in clinical workflows. |