Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 16.1
edited by manuelmenendez
on 2025/02/09 10:08
Change comment: There is no comment for this version
To version 17.1
edited by manuelmenendez
on 2025/02/09 13:01
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,260 +1,154 @@
1 1  == **Overview** ==
2 2  
3 -This project develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility**. The methodology integrates **multi-modal data**, including **genetic, neuroimaging, neurophysiological, and biomarker datasets**, and applies **machine learning models** to generate **structured, explainable diagnostic outputs**.
3 +Neurodiagnoses develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility.**
4 4  
5 -== **Workflow** ==
5 +This methodology integrates **multi-modal data**, including:
6 +**Genetic data** (whole-genome sequencing, polygenic risk scores).
7 +**Neuroimaging** (MRI, PET, EEG, MEG).
8 +**Neurophysiological data** (EEG-based biomarkers, sleep actigraphy).
9 +**CSF & Blood Biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
6 6  
7 -1. (((
8 -**We Use GitHub to [[Store and develop AI models, scripts, and annotation pipelines.>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]]**
11 +By applying **machine learning models**, Neurodiagnoses generates **structured, explainable diagnostic outputs** to assist **clinical decision-making** and **biomarker-driven patient stratification.**
9 9  
10 -* Create a **GitHub repository** for AI scripts and models.
11 -* Use **GitHub Projects** to manage research milestones.
12 -)))
13 -1. (((
14 -**We Use EBRAINS for Data & Collaboration**
15 -
16 -* Store **biomarker and neuroimaging data** in **EBRAINS Buckets**.
17 -* Run **Jupyter Notebooks** in **EBRAINS Lab** to test AI models.
18 -* Use **EBRAINS Wiki** for structured documentation and research discussion.
19 -)))
20 -
21 21  ----
22 22  
23 -== **1. Data Integration** ==
15 +== **Data Integration & External Databases** ==
24 24  
25 -=== **EBRAINS Medical Informatics Platform (MIP)**. ===
17 +=== **How to Use External Databases in Neurodiagnoses** ===
26 26  
27 -Neurodiagnoses integrates clinical data via the **EBRAINS Medical Informatics Platform (MIP)**. MIP federates decentralized clinical data, allowing Neurodiagnoses to securely access and process sensitive information for AI-based diagnostics.
19 +Neurodiagnoses integrates data from multiple **biomedical and neurological research databases**. Researchers can follow these steps to **access, prepare, and integrate** data into the Neurodiagnoses framework.
28 28  
29 -==== How It Works ====
21 +**Potential Data Sources**
22 +**Reference:** [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
30 30  
24 +=== **Register for Access** ===
31 31  
32 -1. (((
33 -**Authentication & API Access:**
26 +Each **external database** requires **individual registration** and approval.
27 +✔️ Follow the official **data access guidelines** of each provider.
28 +✔️ Ensure compliance with **ethical approvals** and **data-sharing agreements (DUAs).**
34 34  
35 -* Users must have an **EBRAINS account**.
36 -* Neurodiagnoses uses **secure API endpoints** to fetch clinical data (e.g., from the **Federation for Dementia**).
37 -)))
38 -1. (((
39 -**Data Mapping & Harmonization:**
30 +=== **Download & Prepare Data** ===
40 40  
41 -* Retrieved data is **normalized** and converted to standard formats (.csv, .json).
42 -* Data from **multiple sources** is harmonized to ensure consistency for AI processing.
43 -)))
44 -1. (((
45 -**Security & Compliance:**
32 +Once access is granted, download datasets **following compliance guidelines** and **format requirements** for integration.
46 46  
47 -* All data access is **logged and monitored**.
48 -* Data remains on **MIP servers** using **federated learning techniques** when possible.
49 -* Access is granted only after signing a **Data Usage Agreement (DUA)**.
50 -)))
34 +**Supported File Formats**
51 51  
52 -==== Implementation Steps ====
36 +* **Tabular Data**: .csv, .tsv
37 +* **Neuroimaging Data**: .nii, .dcm
38 +* **Genomic Data**: .fasta, .vcf
39 +* **Clinical Metadata**: .json, .xml
53 53  
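As a rough illustration of how an upload script might check a file against the supported formats listed above, here is a minimal Python sketch. The function name `classify_file` and the category labels are hypothetical, not part of the Neurodiagnoses codebase. Compressed variants such as `.nii.gz` are not in the list and are therefore treated as unsupported here.

```python
from pathlib import Path
from typing import Optional

# Extension-to-category map copied from the "Supported File Formats" list.
# The category labels themselves are illustrative assumptions.
SUPPORTED_FORMATS = {
    ".csv": "tabular",
    ".tsv": "tabular",
    ".nii": "neuroimaging",
    ".dcm": "neuroimaging",
    ".fasta": "genomic",
    ".vcf": "genomic",
    ".json": "clinical_metadata",
    ".xml": "clinical_metadata",
}


def classify_file(path: str) -> Optional[str]:
    """Return the data category for a file, or None if the format is not listed."""
    # Only the final suffix is inspected, compared case-insensitively.
    return SUPPORTED_FORMATS.get(Path(path).suffix.lower())
```

A caller could reject any file for which `classify_file` returns `None` before attempting an upload.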
41 +**Mandatory Fields for Integration**
54 54  
55 -1. Clone the repository.
56 -1. Configure your **EBRAINS API credentials** in mip_integration.py.
57 -1. Run the script to **download and harmonize clinical data**.
58 -1. Process the data for **AI model training**.
43 +|=**Field Name**|=**Description**
44 +|**Subject ID**|Unique patient identifier
45 +|**Diagnosis**|Standardized disease classification
46 +|**Biomarkers**|CSF, plasma, or imaging biomarkers
47 +|**Genetic Data**|Whole-genome or exome sequencing
48 +|**Neuroimaging Metadata**|MRI/PET acquisition parameters
59 59  
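The mandatory fields above could be checked before upload with a small header validation step. The sketch below assumes, purely for illustration, that each field maps to a snake_case column name in a tabular file. The actual column names expected by Neurodiagnoses are not stated on this page.

```python
import csv
import io
from typing import List

# Hypothetical column names standing in for the mandatory fields in the table.
MANDATORY_COLUMNS = [
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
]


def missing_mandatory_columns(text: str, delimiter: str = ",") -> List[str]:
    """Return the mandatory columns that are absent from the header row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, [])  # an empty input yields an empty header
    present = {name.strip().lower() for name in header}
    return [column for column in MANDATORY_COLUMNS if column not in present]
```

Passing `delimiter="\t"` covers `.tsv` files. An empty return value means every mandatory column is present.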
60 -For more detailed instructions, please refer to the **[[MIP Documentation>>url:https://mip.ebrains.eu/]]**.
50 +=== **Upload Data to Neurodiagnoses** ===
61 61  
62 -----
52 +**Option 1:** Upload to **EBRAINS Bucket** → [[Neurodiagnoses Data Storage>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
53 +**Option 2:** Contribute via **GitHub Repository** → [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
63 63  
64 -=== Data Processing & Integration with Clinica.Run ===
55 +**For large datasets, please contact project administrators before uploading.**
65 65  
66 -Neurodiagnoses now supports **Clinica.Run**, an open-source neuroimaging platform designed for **multimodal data processing and reproducible neuroscience workflows**.
57 +=== **Integrate Data into AI Models** ===
67 67  
68 -==== How It Works ====
59 +Use **Jupyter Notebooks** on EBRAINS for **data preprocessing.**
60 +Standardize data using **harmonization tools.**
61 +Train AI models with **newly integrated datasets.**
69 69  
63 +**Reference:** [[Data Processing Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]]
70 70  
71 -1. (((
72 -**Neuroimaging Preprocessing:**
65 +----
73 73  
74 -* MRI, PET, EEG data is preprocessed using **Clinica.Run pipelines**.
75 -* Supports **longitudinal and cross-sectional analyses**.
76 -)))
77 -1. (((
78 -**Automated Biomarker Extraction:**
67 +== **AI-Powered Annotation & Machine Learning Models** ==
79 79  
80 -* Standardized extraction of **volumetric, metabolic, and functional biomarkers**.
81 -* Integration with machine learning models in Neurodiagnoses.
82 -)))
83 -1. (((
84 -**Data Security & Compliance:**
69 +Neurodiagnoses applies **advanced machine learning models** to classify CNS diseases, extract features from **biomarkers and neuroimaging**, and provide **AI-powered annotation.**
85 85  
86 -* Clinica.Run operates in **compliance with GDPR and HIPAA**.
87 -* Neuroimaging data remains **within the original storage environment**.
88 -)))
71 +=== **AI Model Categories** ===
89 89  
90 -==== Implementation Steps ====
73 +|=**Model Type**|=**Function**|=**Example Algorithms**
74 +|**Probabilistic Diagnosis**|Assigns probability scores to multiple CNS disorders.|Random Forest, XGBoost, Bayesian Networks
75 +|**Tridimensional Diagnosis**|Classifies disorders based on Etiology, Biomarkers, and Neuroanatomical Correlations.|CNNs, Transformers, Autoencoders
76 +|**Biomarker Prediction**|Predicts missing biomarker values using imputation or regression.|KNN Imputation, Bayesian Estimation
77 +|**Neuroimaging Feature Extraction**|Extracts patterns from MRI, PET, EEG.|CNNs, Graph Neural Networks
78 +|**Clinical Decision Support**|Generates AI-driven diagnostic reports.|SHAP Explainability Tools
91 91  
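To make the "KNN Imputation" entry in the table above concrete, the following is a minimal pure-Python sketch of the idea: a missing value is replaced by the mean of that column over the k most similar rows. It is not the project's implementation. The function name and the distance definition are assumptions, and a real pipeline would more likely rely on an established library such as scikit-learn's `KNNImputer`.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]


def knn_impute(rows: List[Row], k: int = 2) -> List[List[float]]:
    """Fill None entries with the column mean over the k nearest rows."""

    def distance(a: Row, b: Row) -> float:
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = []
    for i, row in enumerate(rows):
        filled = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows in which column j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d != math.inf]
            if not nearest:
                raise ValueError(f"no comparable donor for row {i}, column {j}")
            filled[j] = sum(nearest) / len(nearest)
        completed.append(filled)
    return completed
```

In practice, biomarker columns would usually be scaled to comparable ranges first, since an unscaled distance is dominated by whichever column has the largest values.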
80 +**Reference:** [[AI Model Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/models.md]]
92 92  
93 -1. Install **Clinica.Run** dependencies.
94 -1. Configure your **Clinica.Run pipeline** in clinica_run_config.json.
95 -1. Run the pipeline for **preprocessing and biomarker extraction**.
96 -1. Use processed neuroimaging data for **AI-driven diagnostics** in Neurodiagnoses.
82 +----
97 97  
98 -For further information, refer to **[[Clinica.Run Documentation>>url:https://clinica.run/]]**.
84 +== **Clinical Decision Support & Tridimensional Diagnostic Framework** ==
99 99  
100 -==== ====
86 +Neurodiagnoses generates **structured AI reports** for clinicians, combining:
101 101  
102 -==== **Data Sources** ====
88 +**Probabilistic Diagnosis:** AI-generated ranking of potential diagnoses.
89 +**Tridimensional Classification:** Standardized diagnostic reports based on:
103 103  
104 -[[List of potential sources of databases>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
91 +1. **Axis 1:** **Etiology** → Genetic, Autoimmune, Prion, Toxic, Vascular.
92 +1. **Axis 2:** **Molecular Markers** → CSF, Neuroinflammation, EEG biomarkers.
93 +1. **Axis 3:** **Neuroanatomoclinical Correlations** → MRI atrophy, PET.
105 105  
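One way to picture the structured report described above is as a record with one field per axis plus the probabilistic ranking. The class below is a sketch under that assumption. The class name, the field names, and the example values in the usage note are illustrative placeholders and do not come from the Neurodiagnoses repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TridimensionalReport:
    """Hypothetical container for one patient's tridimensional report."""

    subject_id: str
    etiology: List[str]                # Axis 1, e.g. genetic, vascular
    molecular_markers: Dict[str, str]  # Axis 2, marker name -> finding
    neuroanatomoclinical: List[str]    # Axis 3, imaging-based correlates
    ranked_diagnoses: Dict[str, float] = field(default_factory=dict)

    def top_diagnosis(self) -> str:
        """Return the diagnosis with the highest probability score."""
        if not self.ranked_diagnoses:
            raise ValueError("no probabilistic diagnosis available")
        return max(self.ranked_diagnoses, key=self.ranked_diagnoses.get)

    def summary(self) -> str:
        """Render the three axes as a single line of text."""
        markers = ", ".join(f"{k}: {v}" for k, v in self.molecular_markers.items())
        return (
            f"{self.subject_id} | Axis 1: {', '.join(self.etiology)}"
            f" | Axis 2: {markers}"
            f" | Axis 3: {', '.join(self.neuroanatomoclinical)}"
        )
```

For example, `TridimensionalReport("S-001", ["Genetic"], {"CSF Tau": "elevated"}, ["MRI atrophy"], {"A": 0.7, "B": 0.3}).top_diagnosis()` returns `"A"`, where all values are placeholders.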
106 -**Biomedical Ontologies & Databases:**
95 +**Reference:** [[Tridimensional Classification Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/classification.md]]
107 107  
108 -* **Human Phenotype Ontology (HPO)** for symptom annotation.
109 -* **Gene Ontology (GO)** for molecular and cellular processes.
110 -
111 -**Dimensionality Reduction and Interpretability:**
112 -
113 -* **Evaluate interpretability** using metrics like the **Area Under the Interpretability Curve (AUIC)**.
114 -* **Leverage [[DEIBO>>https://github.com/Mellandd/DEIBO]] (Data-driven Embedding Interpretation Based on Ontologies)** to connect model dimensions to ontology concepts.
115 -
116 -**Neuroimaging & EEG/MEG Data:**
117 -
118 -* **MRI volumetric measures** for brain atrophy tracking.
119 -* **EEG functional connectivity patterns** (AI-Mind).
120 -
121 -**Clinical & Biomarker Data:**
122 -
123 -* **CSF biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
124 -* **Sleep monitoring and actigraphy data** (ADIS).
125 -
126 -**Federated Learning Integration:**
127 -
128 -* **Secure multi-center data harmonization** (PROMINENT).
129 -
130 130  ----
131 131  
132 -==== **Annotation System for Multi-Modal Data** ====
99 +== **Data Security, Compliance & Federated Learning** ==
133 133  
134 -To ensure **structured integration of diverse datasets**, **Neurodiagnoses** will implement an **AI-driven annotation system**, which will:
101 +✔ **Privacy-Preserving AI**: Implements **Federated Learning**, ensuring that patient data **never leaves** local institutions.
102 +✔ **Secure Data Access**: Data remains **stored on EBRAINS MIP servers** and is protected with **differential privacy techniques.**
103 +✔ **Ethical & GDPR Compliance**: Data-sharing agreements **must be signed** before use.
135 135  
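The page does not say which federated learning algorithm is used. As an assumption for illustration only, the sketch below shows the aggregation step of federated averaging (FedAvg): each institution trains locally and shares only its model parameters and sample count, and a coordinator combines them into a weighted mean. The function name `federated_average` is hypothetical.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Combine per-site parameter vectors, weighting each by its sample count.

    Each update is (parameters, n_samples). No patient-level data is passed in.
    """
    if not updates:
        raise ValueError("at least one site update is required")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    size = len(updates[0][0])
    averaged = [0.0] * size
    for parameters, n in updates:
        if len(parameters) != size:
            raise ValueError("all sites must share the same model shape")
        for i, value in enumerate(parameters):
            averaged[i] += value * n / total
    return averaged
```

A site with three times as many samples contributes three times the weight, so `federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])` yields `[2.5, 3.5]`.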
136 -* **Assign standardized metadata tags** to diagnostic features.
137 -* **Provide contextual explanations** for AI-based classifications.
138 -* **Track temporal disease progression annotations** to identify long-term trends.
105 +**Reference:** [[Data Protection & Federated Learning>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/security.md]]
139 139  
140 140  ----
141 141  
142 -== **2. AI-Based Analysis** ==
109 +== **Data Processing & Integration with Clinica.Run** ==
143 143  
144 -==== **Machine Learning & Deep Learning Models** ====
111 +Neurodiagnoses now supports **Clinica.Run**, an **open-source neuroimaging platform** for **multimodal data processing.**
145 145  
146 -**Risk Prediction Models:**
113 +=== **How It Works** ===
147 147  
148 -* **LETHE’s cognitive risk prediction model** integrated into the annotation framework.
115 +✔ **Neuroimaging Preprocessing**: MRI, PET, and EEG data are preprocessed using **Clinica.Run pipelines.**
116 +✔ **Automated Biomarker Extraction**: Extracts volumetric, metabolic, and functional biomarkers.
117 +✔ **Data Security & Compliance**: Clinica.Run is **GDPR & HIPAA-compliant.**
149 149  
150 -**Biomarker Classification & Probabilistic Imputation:**
119 +=== **Implementation Steps** ===
151 151  
152 -* **KNN Imputer** and **Bayesian models** used for handling **missing biomarker data**.
121 +1. Install **Clinica.Run** dependencies.
122 +1. Configure **Clinica.Run pipeline** in clinica_run_config.json.
123 +1. Run **biomarker extraction pipelines** for AI-based diagnostics.
153 153  
154 -**Neuroimaging Feature Extraction:**
125 +**Reference:** [[Clinica.Run Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/clinica_run.md]]
155 155  
156 -* **MRI & EEG data** annotated with **neuroanatomical feature labels**.
157 -
158 -==== **AI-Powered Annotation System** ====
159 -
160 -* Uses **SHAP-based interpretability tools** to explain model decisions.
161 -* Generates **automated clinical annotations** in structured reports.
162 -* Links findings to **standardized medical ontologies** (e.g., **SNOMED, HPO**).
163 -
164 164  ----
165 165  
166 -== **3. Diagnostic Framework & Clinical Decision Support** ==
129 +== **Collaborative Development & Research** ==
167 167  
168 -==== **Tridimensional Diagnostic Axes** ====
131 +**We Use GitHub to Develop AI Models & Store Research Data**
169 169  
170 -**Axis 1: Etiology (Pathogenic Mechanisms)**
133 +* **GitHub Repository:** AI model training scripts.
134 +* **GitHub Issues:** Tracks ongoing research questions.
135 +* **GitHub Wiki:** Project documentation & user guides.
171 171  
172 -* Classification based on **genetic markers, cellular pathways, and environmental risk factors**.
173 -* **AI-assisted annotation** provides **causal interpretations** for clinical use.
137 +**We Use EBRAINS for Data & Collaboration**
174 174  
175 -**Axis 2: Molecular Markers & Biomarkers**
139 +* **EBRAINS Buckets:** Large-scale neuroimaging and biomarker storage.
140 +* **EBRAINS Jupyter Notebooks:** Cloud-based AI model execution.
141 +* **EBRAINS Wiki:** Research documentation and updates.
176 176  
177 -* **Integration of CSF, blood, and neuroimaging biomarkers**.
178 -* **Structured annotation** highlights **biological pathways linked to diagnosis**.
143 +**Join the Project Forum:** [[GitHub Discussions>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]]
179 179  
180 -**Axis 3: Neuroanatomoclinical Correlations**
181 -
182 -* **MRI and EEG data** provide anatomical and functional insights.
183 -* **AI-generated progression maps** annotate **brain structure-function relationships**.
184 -
185 185  ----
186 186  
187 -== **4. Computational Workflow & Annotation Pipelines** ==
147 +**For Additional Documentation:**
188 188  
189 -==== **Data Processing Steps** ====
149 +* **GitHub Repository:** [[Neurodiagnoses AI Models>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
150 +* **EBRAINS Wiki:** [[Neurodiagnoses Research Collaboration>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]
190 190  
191 -**Data Ingestion:**
192 -
193 -* **Harmonized datasets** stored in **EBRAINS Bucket**.
194 -* **Preprocessing pipelines** clean and standardize data.
195 -
196 -**Feature Engineering:**
197 -
198 -* **AI models** extract **clinically relevant patterns** from **EEG, MRI, and biomarkers**.
199 -
200 -**AI-Generated Annotations:**
201 -
202 -* **Automated tagging** of diagnostic features in **structured reports**.
203 -* **Explainability modules (SHAP, LIME)** ensure transparency in predictions.
204 -
205 -**Clinical Decision Support Integration:**
206 -
207 -* **AI-annotated findings** fed into **interactive dashboards**.
208 -* **Clinicians can adjust, validate, and modify annotations**.
209 -
210 210  ----
211 211  
212 -== **5. Validation & Real-World Testing** ==
213 -
214 -==== **Prospective Clinical Study** ====
215 -
216 -* **Multi-center validation** of AI-based **annotations & risk stratifications**.
217 -* **Benchmarking against clinician-based diagnoses**.
218 -* **Real-world testing** of AI-powered **structured reporting**.
219 -
220 -==== **Quality Assurance & Explainability** ====
221 -
222 -* **Annotations linked to structured knowledge graphs** for improved transparency.
223 -* **Interactive annotation editor** allows clinicians to validate AI outputs.
224 -
225 -----
226 -
227 -== **6. Collaborative Development** ==
228 -
229 -The project is **open to contributions** from **researchers, clinicians, and developers**.
230 -
231 -**Key tools include:**
232 -
233 -* **Jupyter Notebooks**: For data analysis and pipeline development.
234 -** Example: **probabilistic imputation**
235 -* **Wiki Pages**: For documenting methods and results.
236 -* **Drive and Bucket**: For sharing code, data, and outputs.
237 -* **Collaboration with related projects**:
238 -** Example: **Beyond the hype: AI in dementia – from early risk detection to disease treatment**
239 -
240 -----
241 -
242 -== **7. Tools and Technologies** ==
243 -
244 -==== **Programming Languages:** ====
245 -
246 -* **Python** for AI and data processing.
247 -
248 -==== **Frameworks:** ====
249 -
250 -* **TensorFlow** and **PyTorch** for machine learning.
251 -* **Flask** or **FastAPI** for backend services.
252 -
253 -==== **Visualization:** ====
254 -
255 -* **Plotly** and **Matplotlib** for interactive and static visualizations.
256 -
257 -==== **EBRAINS Services:** ====
258 -
259 -* **Collaboratory Lab** for running Notebooks.
260 -* **Buckets** for storing large datasets.
154 +**Neurodiagnoses is Open for Contributions – Join Us Today!**