Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 18.1
edited by manuelmenendez
on 2025/02/13 12:52
Change comment: There is no comment for this version
To version 12.2
edited by manuelmenendez
on 2025/02/09 09:54
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,133 +1,273 @@
1 -== **Overview** ==
1 +==== **Overview** ====
2 2  
3 -Neurodiagnoses develops a tridimensional diagnostic framework for CNS diseases, incorporating AI-powered annotation tools to improve interpretability, standardization, and clinical utility. The methodology integrates multi-modal data, including genetic, neuroimaging, neurophysiological, and biomarker datasets, and applies machine learning models to generate structured, explainable diagnostic outputs.
3 +This project develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility**. The methodology integrates **multi-modal data**, including **genetic, neuroimaging, neurophysiological, and biomarker datasets**, and applies **machine learning models** to generate **structured, explainable diagnostic outputs**.
4 4  
5 +=== **Workflow** ===
6 +
7 +1. (((
8 +**We use GitHub to [[store and develop AI models, scripts, and annotation pipelines>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]].**
9 +
10 +* Create a **GitHub repository** for AI scripts and models.
11 +* Use **GitHub Projects** to manage research milestones.
12 +)))
13 +1. (((
14 +**We use EBRAINS for data and collaboration.**
15 +
16 +* Store **biomarker and neuroimaging data** in **EBRAINS Buckets**.
17 +* Run **Jupyter Notebooks** in **EBRAINS Lab** to test AI models.
18 +* Use **EBRAINS Wiki** for structured documentation and research discussion.
19 +)))
20 +
5 5  ----
6 6  
7 -== **How to Use External Databases in Neurodiagnoses** ==
23 +=== **1. Data Integration** ===
8 8  
9 -To enhance the accuracy of our diagnostic models, Neurodiagnoses integrates data from multiple biomedical and neurological research databases. If you are a researcher, follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.
25 +== Overview ==
10 10  
11 -=== **Potential Data Sources** ===
12 12  
13 -Neurodiagnoses maintains an updated list of potential biomedical databases relevant to neurodegenerative diseases.
28 +Neurodiagnoses integrates clinical data via the **EBRAINS Medical Informatics Platform (MIP)**. MIP federates decentralized clinical data, allowing Neurodiagnoses to securely access and process sensitive information for AI-based diagnostics.
14 14  
15 -* Reference: [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
30 +== How It Works ==
16 16  
17 -=== **1. Register for Access** ===
18 18  
19 -Each external database requires individual registration and access approval. Follow the official guidelines of each database provider.
33 +1. (((
34 +**Authentication & API Access:**
20 20  
21 -* Ensure that you have completed all ethical approvals and data access agreements before integrating datasets into Neurodiagnoses.
22 -* Some repositories require a Data Usage Agreement (DUA) before downloading sensitive medical data.
36 +* Users must have an **EBRAINS account**.
37 +* Neurodiagnoses uses **secure API endpoints** to fetch clinical data (e.g., from the **Federation for Dementia**).
38 +)))
39 +1. (((
40 +**Data Mapping & Harmonization:**
23 23  
24 -=== **2. Download & Prepare Data** ===
42 +* Retrieved data is **normalized** and converted to standard formats (.csv, .json).
43 +* Data from **multiple sources** is harmonized to ensure consistency for AI processing.
44 +)))
45 +1. (((
46 +**Security & Compliance:**
25 25  
26 -Once access is granted, download datasets while complying with data usage policies. Ensure that the files meet Neurodiagnoses’ format requirements for smooth integration.
48 +* All data access is **logged and monitored**.
49 +* When possible, data remains on **MIP servers** and is analyzed with **federated learning techniques**.
50 +* Access is granted only after signing a **Data Usage Agreement (DUA)**.
51 +)))
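The data mapping and harmonization step above can be sketched as follows. This is a minimal illustration that uses only the Python standard library; the site names, column names, and the `harmonize` and `to_csv` helpers are hypothetical and do not describe the actual MIP schema or the project's code.

```python
import csv
import io
import json

# Hypothetical per-site column names mapped onto one shared schema.
COLUMN_MAP = {
    "site_a": {"patient": "subject_id", "dx": "diagnosis", "abeta42": "csf_abeta42"},
    "site_b": {"SubjectID": "subject_id", "Diagnosis": "diagnosis", "AB42": "csf_abeta42"},
}


def harmonize(record, site):
    """Rename site-specific keys to the shared schema and coerce numeric fields."""
    mapping = COLUMN_MAP[site]
    row = {mapping[key]: value for key, value in record.items() if key in mapping}
    row["csf_abeta42"] = float(row["csf_abeta42"])
    row["source"] = site
    return row


def to_csv(rows):
    """Serialize harmonized rows to CSV text with a stable column order."""
    fields = ["subject_id", "diagnosis", "csf_abeta42", "source"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented example records from two sites with different column conventions.
rows = [
    harmonize({"patient": "A-001", "dx": "AD", "abeta42": "512.5"}, "site_a"),
    harmonize({"SubjectID": "B-017", "Diagnosis": "MCI", "AB42": "830"}, "site_b"),
]
print(to_csv(rows))
print(json.dumps(rows, indent=2))
```

Keeping the mapping in one dictionary per site means a new data source only needs a new mapping entry, not new conversion code.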
27 27  
28 -==== **Supported File Formats** ====
53 +== Implementation Steps ==
29 29  
30 -* Tabular Data: .csv, .tsv
31 -* Neuroimaging Data: .nii, .dcm
32 -* Genomic Data: .fasta, .vcf
33 -* Clinical Metadata: .json, .xml
34 34  
35 -==== **Mandatory Fields for Integration** ====
56 +1. Clone the repository.
57 +1. Configure your **EBRAINS API credentials** in mip_integration.py.
58 +1. Run the script to **download and harmonize clinical data**.
59 +1. Process the data for **AI model training**.
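Step 2 does not say how the credentials are stored. One common option, shown here as an assumption rather than as the behaviour of mip_integration.py, is to read an access token from an environment variable and send it as a bearer token. The variable name `EBRAINS_TOKEN` and the helper `build_auth_headers` are hypothetical.

```python
import os


def build_auth_headers(env_var="EBRAINS_TOKEN"):
    """Return HTTP headers carrying a bearer token read from the environment."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set {env_var} before running the integration script")
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}
```

Reading the token at run time keeps secrets out of the repository, which matters when the script is developed on GitHub.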
36 36  
37 -|=Field Name|=Description
38 -|Subject ID|Unique patient identifier
39 -|Diagnosis|Standardized disease classification
40 -|Biomarkers|CSF, plasma, or imaging biomarkers
41 -|Genetic Data|Whole-genome or exome sequencing
42 -|Neuroimaging Metadata|MRI/PET acquisition parameters
61 +For more detailed instructions, please refer to the **[[MIP Documentation>>url:https://mip.ebrains.eu/]]**.
43 43  
44 -=== **3. Upload Data to Neurodiagnoses** ===
63 +----
45 45  
46 -Once preprocessed, data can be uploaded to EBRAINS or GitHub.
65 += Data Processing & Integration with Clinica.Run =
47 47  
48 -* (((
49 -**Option 1: Upload to EBRAINS Bucket**
50 50  
51 -* Location: [[EBRAINS Neurodiagnoses Bucket>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
52 -* Ensure correct metadata tagging before submission.
68 +== Overview ==
69 +
70 +
71 +Neurodiagnoses now supports **Clinica.Run**, an open-source neuroimaging platform designed for **multimodal data processing and reproducible neuroscience workflows**.
72 +
73 +== How It Works ==
74 +
75 +
76 +1. (((
77 +**Neuroimaging Preprocessing:**
78 +
79 +* MRI, PET, and EEG data are preprocessed using **Clinica.Run pipelines**.
80 +* Supports **longitudinal and cross-sectional analyses**.
53 53  )))
54 -* (((
55 -**Option 2: Contribute via GitHub Repository**
82 +1. (((
83 +**Automated Biomarker Extraction:**
56 56  
57 -* Location: [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
58 -* Create a new folder under /data/ and include dataset description.
85 +* Standardized extraction of **volumetric, metabolic, and functional biomarkers**.
86 +* Integration with machine learning models in Neurodiagnoses.
59 59  )))
88 +1. (((
89 +**Data Security & Compliance:**
60 60  
61 -//Note: For large datasets, please contact the project administrators before uploading.//
91 +* Clinica.Run operates in **compliance with GDPR and HIPAA**.
92 +* Neuroimaging data remains **within the original storage environment**.
93 +)))
62 62  
63 -=== **4. Integrate Data into AI Models** ===
95 +== Implementation Steps ==
64 64  
65 -Once uploaded, datasets must be harmonized and formatted before AI model training.
66 66  
67 -==== **Steps for Data Integration** ====
98 +1. Install **Clinica.Run** dependencies.
99 +1. Configure your **Clinica.Run pipeline** in clinica_run_config.json.
100 +1. Run the pipeline for **preprocessing and biomarker extraction**.
101 +1. Use processed neuroimaging data for **AI-driven diagnostics** in Neurodiagnoses.
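The configuration step can be made safer by checking the file before a long pipeline run starts. The sketch below assumes a JSON file with three keys; the key names and the example values are illustrative guesses, since the real schema of clinica_run_config.json is not described on this page.

```python
import json

# Hypothetical keys; the real schema of clinica_run_config.json is not documented here.
REQUIRED_KEYS = {"pipeline", "input_dir", "output_dir"}


def load_config(text):
    """Parse a JSON config string and fail early if a required key is missing."""
    config = json.loads(text)
    missing = sorted(REQUIRED_KEYS - set(config))
    if missing:
        raise ValueError(f"Missing config keys: {', '.join(missing)}")
    return config


# Illustrative values only.
example = '{"pipeline": "t1-volume", "input_dir": "bids/", "output_dir": "caps/"}'
config = load_config(example)
print(config["pipeline"])
```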
68 68  
69 -* Open Jupyter Notebooks on EBRAINS to run preprocessing scripts.
70 -* Standardize neuroimaging and biomarker formats using harmonization tools.
71 -* Use machine learning models to handle missing data and feature extraction.
72 -* Train AI models with newly integrated patient cohorts.
73 -* Reference: [[Detailed instructions can be found in docs/data_processing.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]].
103 +For further information, refer to **[[Clinica.Run Documentation>>url:https://clinica.run/]]**.
74 74  
105 +==== ====
106 +
107 +==== **Data Sources** ====
108 +
109 +[[List of potential database sources>>https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
110 +
111 +**Biomedical Ontologies & Databases:**
112 +
113 +* **Human Phenotype Ontology (HPO)** for symptom annotation.
114 +* **Gene Ontology (GO)** for molecular and cellular processes.
115 +
116 +**Dimensionality Reduction and Interpretability:**
117 +
118 +* **Evaluate interpretability** using metrics like the **Area Under the Interpretability Curve (AUIC)**.
119 +* **Leverage [[DEIBO>>https://github.com/Mellandd/DEIBO]] (Data-driven Embedding Interpretation Based on Ontologies)** to connect model dimensions to ontology concepts.
120 +
121 +**Neuroimaging & EEG/MEG Data:**
122 +
123 +* **MRI volumetric measures** for brain atrophy tracking.
124 +* **EEG functional connectivity patterns** (AI-Mind).
125 +
126 +**Clinical & Biomarker Data:**
127 +
128 +* **CSF biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
129 +* **Sleep monitoring and actigraphy data** (ADIS).
130 +
131 +**Federated Learning Integration:**
132 +
133 +* **Secure multi-center data harmonization** (PROMINENT).
134 +
75 75  ----
76 76  
77 -== **Database Sources Table** ==
137 +==== **Annotation System for Multi-Modal Data** ====
78 78  
79 -=== **Where to Insert This** ===
139 +To ensure **structured integration of diverse datasets**, **Neurodiagnoses** will implement an **AI-driven annotation system**, which will:
80 80  
81 -* GitHub: [[docs/data_sources.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_sources.md]]
82 -* EBRAINS Wiki: Collabs/neurodiagnoses/Data Sources
141 +* **Assign standardized metadata tags** to diagnostic features.
142 +* **Provide contextual explanations** for AI-based classifications.
143 +* **Track temporal disease progression annotations** to identify long-term trends.
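One way to picture the three responsibilities listed above is a single annotation record that carries tags, an explanation, and an observation date. This is a sketch under assumptions: the `Annotation` class, its field names, and the example values are invented for illustration and are not the project's data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Annotation:
    """One annotated diagnostic feature; all field names are illustrative."""
    feature: str
    value: float
    tags: list = field(default_factory=list)
    explanation: str = ""
    observed_on: Optional[date] = None


def progression(annotations, feature):
    """Return (date, value) pairs for one feature, ordered by observation date."""
    series = [(a.observed_on, a.value) for a in annotations if a.feature == feature]
    return sorted(series)


# Invented values; the later visit is deliberately listed first.
history = [
    Annotation("hippocampal_volume_ml", 3.1, ["MRI", "atrophy"], "Below cohort median", date(2024, 6, 1)),
    Annotation("hippocampal_volume_ml", 3.4, ["MRI"], "Within expected range", date(2023, 6, 1)),
]
print(progression(history, "hippocampal_volume_ml"))
```

Storing the date on every annotation is what allows long-term trends to be reconstructed later without a separate progression table.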
83 83  
84 -=== **Key Databases for Neurodiagnoses** ===
145 +----
85 85  
86 -|=Database|=Focus Area|=Data Type|=Access Link
87 -|ADNI|Alzheimer's Disease|MRI, PET, CSF, cognitive tests|ADNI
88 -|PPMI|Parkinson’s Disease|Imaging, biospecimens|[[PPMI>>url:https://www.ppmi-info.org/]]
89 -|GP2|Genetic Data for PD|Whole-genome sequencing|[[GP2>>url:https://gp2.org/]]
90 -|Enroll-HD|Huntington’s Disease|Clinical, genetic, imaging|[[Enroll-HD>>url:https://enroll-hd.org/]]
91 -|GAAIN|Alzheimer's & Cognitive Decline|Multi-source data aggregation|[[GAAIN>>url:https://www.gaain.org/]]
92 -|UK Biobank|Population-wide studies|Genetic, imaging, health records|[[UK Biobank>>url:https://www.ukbiobank.ac.uk/]]
93 -|DPUK|Dementia & Aging|Imaging, genetics, lifestyle factors|[[DPUK>>url:https://www.dementiasplatform.uk/]]
94 -|PRION Registry|Prion Diseases|Clinical and genetic data|[[PRION Registry>>url:https://www.prionalliance.org/]]
95 -|DECIPHER|Rare Genetic Disorders|Genomic variants|DECIPHER
147 +=== **2. AI-Based Analysis** ===
96 96  
97 -If you know a relevant dataset, submit a proposal in [[GitHub Issues>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/issues]].
149 +==== **Machine Learning & Deep Learning Models** ====
98 98  
151 +**Risk Prediction Models:**
152 +
153 +* **LETHE’s cognitive risk prediction model** integrated into the annotation framework.
154 +
155 +**Biomarker Classification & Probabilistic Imputation:**
156 +
157 +* **KNN Imputer** and **Bayesian models** used for handling **missing biomarker data**.
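The idea behind k-nearest-neighbour imputation can be shown without any dependencies. The function below is a simplified stand-in, not the scikit-learn `KNNImputer` named above: it fills a missing value with the mean of that column over the `k` most similar rows, measuring similarity only on columns both rows have observed. The column order (amyloid-beta 42, p-tau, NfL) and all values are invented.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over columns observed in both rows.
    """
    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_index, other in enumerate(rows):
                if other_index == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other) if a is not None and b is not None]
                if not shared:
                    continue
                distance = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((distance, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                completed[i][j] = sum(nearest) / len(nearest)
    return completed


# Invented biomarker rows: [abeta42, ptau, nfl]; the second row lacks p-tau.
cohort = [
    [500.0, 60.0, 1200.0],
    [520.0, None, 1250.0],
    [900.0, 20.0, 600.0],
    [510.0, 64.0, 1180.0],
]
imputed = knn_impute(cohort, k=2)
print(imputed[1])
```

In practice the features would be scaled first, because unscaled columns with large ranges dominate the distance.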
158 +
159 +**Neuroimaging Feature Extraction:**
160 +
161 +* **MRI & EEG data** annotated with **neuroanatomical feature labels**.
162 +
163 +==== **AI-Powered Annotation System** ====
164 +
165 +* Uses **SHAP-based interpretability tools** to explain model decisions.
166 +* Generates **automated clinical annotations** in structured reports.
167 +* Links findings to **standardized medical ontologies** (e.g., **SNOMED, HPO**).
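SHAP explanations are built on Shapley values. For a model with only a few features these can be computed exactly, which makes the idea easy to inspect. The sketch below does not use the SHAP library; it enumerates all feature orderings for a toy linear risk model whose coefficients, baseline, and patient values are invented for illustration.

```python
from itertools import permutations

# Hypothetical reference patient used as the "feature absent" baseline.
BASELINE = {"abeta42": 800.0, "ptau": 25.0, "age": 65.0}


def risk_score(x):
    """Toy linear risk model; the coefficients are invented for illustration."""
    return (0.5
            - 0.001 * (x["abeta42"] - 800.0)
            + 0.01 * (x["ptau"] - 25.0)
            + 0.005 * (x["age"] - 65.0))


def shapley_values(model, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    features = list(instance)
    orderings = list(permutations(features))
    totals = {name: 0.0 for name in features}
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            # Switch one feature from its baseline value to the patient's value.
            current[name] = instance[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


patient = {"abeta42": 500.0, "ptau": 65.0, "age": 75.0}
contributions = shapley_values(risk_score, patient, BASELINE)
for name, contribution in contributions.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the property that lets an annotation state how much each biomarker moved the prediction. Exact enumeration grows factorially with the number of features, so real models need approximations such as those SHAP provides.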
168 +
99 99  ----
100 100  
101 -== **Collaboration & Partnerships** ==
171 +=== **3. Diagnostic Framework & Clinical Decision Support** ===
102 102  
103 -=== **Where to Insert This** ===
173 +==== **Tridimensional Diagnostic Axes** ====
104 104  
105 -* GitHub: [[docs/collaboration.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/collaboration.md]]
106 -* EBRAINS Wiki: Collabs/neurodiagnoses/Collaborations
175 +**Axis 1: Etiology (Pathogenic Mechanisms)**
107 107  
108 -=== **Partnering with Data Providers** ===
177 +* Classification based on **genetic markers, cellular pathways, and environmental risk factors**.
178 +* **AI-assisted annotation** provides **causal interpretations** for clinical use.
109 109  
110 -Beyond using existing datasets, Neurodiagnoses seeks partnerships with data repositories to:
180 +**Axis 2: Molecular Markers & Biomarkers**
111 111  
112 -* Enable direct API-based data integration for real-time processing.
113 -* Co-develop harmonized AI-ready datasets with standardized annotations.
114 -* Secure funding opportunities through joint grant applications.
182 +* **Integration of CSF, blood, and neuroimaging biomarkers**.
183 +* **Structured annotation** highlights **biological pathways linked to diagnosis**.
115 115  
116 -=== **Interested in Partnering?** ===
185 +**Axis 3: Neuroanatomoclinical Correlations**
117 117  
118 -If you represent a research consortium or database provider, reach out to explore data-sharing agreements.
187 +* **MRI and EEG data** provide anatomical and functional insights.
188 +* **AI-generated progression maps** annotate **brain structure-function relationships**.
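The three axes can be represented as one structured record so that every diagnosis has the same shape. The class below is a hypothetical sketch: the name `TridimensionalDiagnosis`, the keys inside each axis, and the example values are assumptions made for illustration, not the project's schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TridimensionalDiagnosis:
    """Container for the three axes; field contents are illustrative."""
    etiology: dict
    biomarkers: dict
    neuroanatomoclinical: dict

    def to_report(self):
        """Serialize the three axes as a JSON string for a structured report."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Invented example values for each axis.
diagnosis = TridimensionalDiagnosis(
    etiology={"genetic_markers": ["APOE e4"], "environmental_risk": []},
    biomarkers={"csf_abeta42": "low", "csf_ptau": "high"},
    neuroanatomoclinical={"mri": "medial temporal atrophy", "eeg": "not assessed"},
)
print(diagnosis.to_report())
```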
119 119  
120 -* Contact: [[info@neurodiagnoses.com>>mailto:info@neurodiagnoses.com]]
190 +----
121 121  
192 +=== **4. Computational Workflow & Annotation Pipelines** ===
193 +
194 +==== **Data Processing Steps** ====
195 +
196 +**Data Ingestion:**
197 +
198 +* **Harmonized datasets** stored in **EBRAINS Bucket**.
199 +* **Preprocessing pipelines** clean and standardize data.
200 +
201 +**Feature Engineering:**
202 +
203 +* **AI models** extract **clinically relevant patterns** from **EEG, MRI, and biomarkers**.
204 +
205 +**AI-Generated Annotations:**
206 +
207 +* **Automated tagging** of diagnostic features in **structured reports**.
208 +* **Explainability modules (SHAP, LIME)** ensure transparency in predictions.
209 +
210 +**Clinical Decision Support Integration:**
211 +
212 +* **AI-annotated findings** fed into **interactive dashboards**.
213 +* **Clinicians can adjust, validate, and modify annotations**.
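The four stages above can be read as a chain of small functions. The sketch below is a toy version under stated assumptions: a single invented biomarker (`nfl`), an arbitrary z-score threshold, and hypothetical function names. It shows the hand-off from cleaning to clinician review and makes no claim about the project's actual pipeline.

```python
from statistics import mean, pstdev


def clean(records):
    """Drop records without a subject ID or without the measured value."""
    return [r for r in records if r.get("subject_id") and r.get("nfl") is not None]


def standardize(records, key="nfl"):
    """Add a z-score for `key`, computed over the cleaned cohort."""
    values = [r[key] for r in records]
    mu, sigma = mean(values), pstdev(values)
    return [{**r, f"{key}_z": (r[key] - mu) / sigma if sigma else 0.0} for r in records]


def annotate(records, threshold=1.0):
    """Tag records whose z-score exceeds the (arbitrary) threshold."""
    return [
        {**r, "annotation": "elevated" if r["nfl_z"] > threshold else "typical", "validated": False}
        for r in records
    ]


def clinician_review(record, accepted, corrected=None):
    """Record a clinician's decision; a correction replaces the AI annotation."""
    reviewed = dict(record, validated=True)
    if not accepted and corrected is not None:
        reviewed["annotation"] = corrected
    return reviewed


# Invented cohort; the record without a subject ID is dropped by clean().
raw = [
    {"subject_id": "S1", "nfl": 10.0},
    {"subject_id": "S2", "nfl": 12.0},
    {"subject_id": "S3", "nfl": 11.0},
    {"subject_id": "S4", "nfl": 30.0},
    {"subject_id": "", "nfl": 15.0},
]
annotated = annotate(standardize(clean(raw)))
final = clinician_review(annotated[3], accepted=False, corrected="typical")
print(annotated[3]["annotation"], "->", final["annotation"])
```

Keeping the clinician's decision as a separate step leaves the AI output and the validated output distinguishable in the record.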
214 +
122 122  ----
123 123  
124 -== **Final Notes** ==
217 +=== **5. Validation & Real-World Testing** ===
125 125  
126 -Neurodiagnoses continuously expands its data ecosystem to support AI-driven clinical decision-making. Researchers and institutions are encouraged to contribute new datasets and methodologies.
219 +==== **Prospective Clinical Study** ====
127 127  
128 -For additional technical documentation:
221 +* **Multi-center validation** of AI-based **annotations & risk stratifications**.
222 +* **Benchmarking against clinician-based diagnoses**.
223 +* **Real-world testing** of AI-powered **structured reporting**.
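Benchmarking against clinician-based diagnoses needs an agreement measure that corrects for chance. Cohen's kappa is one standard choice; the page does not state which metric the study will use, so treat this as an illustrative option. The labels below are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both raters must label the same non-empty set of cases")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both raters labelled independently at their own base rates.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented labels for eight cases.
ai = ["AD", "AD", "MCI", "CN", "AD", "CN", "MCI", "CN"]
clinician = ["AD", "MCI", "MCI", "CN", "AD", "CN", "AD", "CN"]
kappa = cohens_kappa(ai, clinician)
print(f"kappa = {kappa:.3f}")
```

Raw accuracy would overstate agreement when one diagnosis dominates the cohort, which is why a chance-corrected statistic is the safer headline number.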
129 129  
130 -* [[GitHub Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
131 -* [[EBRAINS Collaboration Page>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]
225 +==== **Quality Assurance & Explainability** ====
132 132  
133 -If you experience issues integrating data, open a [[GitHub Issue>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/issues]] or consult the EBRAINS Neurodiagnoses Forum.
227 +* **Annotations linked to structured knowledge graphs** for improved transparency.
228 +* **Interactive annotation editor** allows clinicians to validate AI outputs.
229 +
230 +----
231 +
232 +=== **6. Collaborative Development** ===
233 +
234 +The project is **open to contributions** from **researchers, clinicians, and developers**.
235 +
236 +**Key tools include:**
237 +
238 +* **Jupyter Notebooks**: For data analysis and pipeline development.
239 +** Example: **probabilistic imputation**
240 +* **Wiki Pages**: For documenting methods and results.
241 +* **Drive and Bucket**: For sharing code, data, and outputs.
242 +* **Collaboration with related projects**:
243 +** Example: **Beyond the hype: AI in dementia – from early risk detection to disease treatment**
244 +
245 +----
246 +
247 +=== **7. Tools and Technologies** ===
248 +
249 +==== **Programming Languages:** ====
250 +
251 +* **Python** for AI and data processing.
252 +
253 +==== **Frameworks:** ====
254 +
255 +* **TensorFlow** and **PyTorch** for machine learning.
256 +* **Flask** or **FastAPI** for backend services.
257 +
258 +==== **Visualization:** ====
259 +
260 +* **Plotly** and **Matplotlib** for interactive and static visualizations.
261 +
262 +==== **EBRAINS Services:** ====
263 +
264 +* **Collaboratory Lab** for running Notebooks.
265 +* **Buckets** for storing large datasets.
266 +
267 +----
268 +
269 +=== **Why This Matters** ===
270 +
271 +* The annotation system ensures that AI-generated insights are structured, interpretable, and clinically meaningful.
272 +* It enables real-time tracking of disease progression across the three diagnostic axes.
273 +* It facilitates integration with electronic health records and decision-support tools, improving AI adoption in clinical workflows.