Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 19.1
edited by manuelmenendez
on 2025/02/14 13:57
Change comment: There is no comment for this version
To version 7.1
edited by manuelmenendez
on 2025/02/01 14:11
Change comment: There is no comment for this version

... ... @@ -1,207 +1,189 @@
1 -**# Neurodiagnoses AI: Multimodal AI for Neurodiagnostic Predictions**
1 +==== **Overview** ====
2 2  
3 -## **Project Overview**
4 -Neurodiagnoses AI implements AI-driven diagnostic and prognostic models for central nervous system (CNS) disorders, adapting the Florey Dementia Index (FDI) methodology to a broader set of neurological conditions. The approach integrates **multimodal data sources** (EEG, neuroimaging, biomarkers, and genetics) and employs **machine learning models** to provide **explainable, real-time diagnostic insights**.##
3 +This project develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility**. The methodology integrates **multi-modal data**, including **genetic, neuroimaging, neurophysiological, and biomarker datasets**, and applies **machine learning models** to generate **structured, explainable diagnostic outputs**.
5 5  
6 -## **How to Use External Databases in Neurodiagnoses**
7 -To enhance diagnostic accuracy, Neurodiagnoses integrates data from multiple biomedical and neurological research databases. Researchers can follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.##
5 +=== **Workflow** ===
8 8  
9 -### **Potential Data Sources**
10 -Neurodiagnoses maintains an updated list of potential biomedical databases relevant to neurodegenerative diseases. ##
7 +1. (((
8 +**We Use GitHub for AI Development**
11 11  
12 -**Reference: List of Potential Databases**
13 -- **ADNI**: Alzheimer's Disease data ([ADNI](https://adni.loni.usc.edu))
14 -- **PPMI**: Parkinson’s Disease Imaging and biospecimens ([PPMI](https://www.ppmi-info.org))
15 -- **GP2**: Whole-genome sequencing for PD ([GP2](https://gp2.org))
16 -- **Enroll-HD**: Huntington’s Disease Clinical and genetic data ([Enroll-HD](https://www.enroll-hd.org))
17 -- **GAAIN**: Multi-source Alzheimer’s data aggregation ([GAAIN](https://gaain.org))
18 -- **UK Biobank**: Population-wide genetic, imaging, and health records ([UK Biobank](https://www.ukbiobank.ac.uk))
19 -- **DPUK**: Dementia and Aging data ([DPUK](https://www.dementiasplatform.uk))
20 -- **PRION Registry**: Prion Diseases clinical and genetic data ([PRION Registry](https://prionregistry.org))
21 -- **DECIPHER**: Rare genetic disorder genomic variants ([DECIPHER](https://decipher.sanger.ac.uk))
10 +* Create a **GitHub repository** for AI scripts and models.
11 +* Use **GitHub Projects** to manage research milestones.
12 +)))
13 +1. (((
14 +**We Use EBRAINS for Data & Collaboration**
22 22  
23 -### **1. Register for Access**
24 -- Each external database requires **individual registration** and access approval.
25 -- Ensure compliance with **ethical approvals** and **data usage agreements** before integrating datasets into Neurodiagnoses.
26 -- Some repositories may require a **Data Usage Agreement (DUA)** for sensitive medical data.##
16 +* Store **biomarker and neuroimaging data** in **EBRAINS Buckets**.
17 +* Run **Jupyter Notebooks** in **EBRAINS Lab** to test AI models.
18 +* Use **EBRAINS Wiki** for structured documentation and research discussion.
19 +)))
27 27  
28 -### **2. Download & Prepare Data**
29 -- Download datasets while adhering to database usage policies.
30 -- Ensure files meet **Neurodiagnoses format requirements**:
31 - - **Tabular Data**: `.csv`, `.tsv`
32 - - **Neuroimaging Data**: `.nii`, `.dcm`
33 - - **Genomic Data**: `.fasta`, `.vcf`
34 - - **Clinical Metadata**: `.json`, `.xml`##
21 +----
35 35  
36 -- **Mandatory Fields for Integration**:
37 - - **Subject ID**: Unique patient identifier
38 - - **Diagnosis**: Standardized disease classification
39 - - **Biomarkers**: CSF, plasma, or imaging biomarkers
40 - - **Genetic Data**: Whole-genome or exome sequencing
41 - - **Neuroimaging Metadata**: MRI/PET acquisition parameters
23 +=== **1. Data Integration** ===
42 42  
43 -### **3. Upload Data to Neurodiagnoses**
44 -**Option 1: Upload to EBRAINS Bucket**
45 -- Location: **EBRAINS Neurodiagnoses Bucket**
46 -- Ensure correct **metadata tagging** before submission.##
25 +==== **Data Sources** ====
47 47  
48 - **Option 2: Contribute via GitHub Repository**
49 -- Location: **GitHub Data Repository**
50 -- Create a new folder under `/data/` and include a **dataset description**.
51 -- For large datasets, contact project administrators before uploading.
27 +**Biomedical Ontologies & Databases:**
52 52  
53 -### **4. Integrate Data into AI Models**
54 -- Open **Jupyter Notebooks** on EBRAINS to run **preprocessing scripts**.
55 -- Standardize **neuroimaging and biomarker formats** using harmonization tools.
56 -- Use **machine learning models** to handle missing data and feature extraction.
57 -- Train AI models with **newly integrated patient cohorts**.##
29 +* **Human Phenotype Ontology (HPO)** for symptom annotation.
30 +* **Gene Ontology (GO)** for molecular and cellular processes.
58 58  
59 -**Reference**: See `docs/data_processing.md` for detailed instructions.
32 +**Dimensionality Reduction and Interpretability:**
60 60  
61 -## **Collaboration & Partnerships**##
62 -# **Partnering with Data Providers**
63 -Neurodiagnoses seeks partnerships with data repositories to:
64 -- Enable **API-based data integration** for real-time processing.
65 -- Co-develop **harmonized AI-ready datasets** with standardized annotations.
66 -- Secure **funding opportunities** through joint grant applications.
34 +* **Evaluate interpretability** using metrics like the **Area Under the Interpretability Curve (AUIC)**.
35 +* **Leverage DEIBO (Data-driven Embedding Interpretation Based on Ontologies)** to connect model dimensions to ontology concepts.
67 67  
68 -**Interested in Partnering?**
69 -- If you represent a research consortium or database provider, reach out to explore data-sharing agreements.
70 -- **Contact**: info@neurodiagnoses.com
37 +**Neuroimaging & EEG/MEG Data:**
71 71  
72 -## **Final Notes**
73 -Neurodiagnoses continuously expands its data ecosystem to support AI-driven clinical decision-making. Researchers and institutions are encouraged to contribute **new datasets and methodologies**.##
39 +* **MRI volumetric measures** for brain atrophy tracking.
40 +* **EEG functional connectivity patterns** (AI-Mind).
74 74  
75 -For additional technical documentation:
76 -- **GitHub Repository**: [Neurodiagnoses GitHub](https://github.com/neurodiagnoses)
77 -- **EBRAINS Collaboration Page**: [EBRAINS Neurodiagnoses](https://ebrains.eu/collabs/neurodiagnoses)
42 +**Clinical & Biomarker Data:**
78 78  
79 -If you experience issues integrating data, **open a GitHub Issue** or consult the **EBRAINS Neurodiagnoses Forum**.
44 +* **CSF biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
45 +* **Sleep monitoring and actigraphy data** (ADIS).
80 80  
81 -== **How to Use External Databases in Neurodiagnoses** ==
47 +**Federated Learning Integration:**
82 82  
83 -To enhance the accuracy of our diagnostic models, Neurodiagnoses integrates data from multiple biomedical and neurological research databases. If you are a researcher, follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.
49 +* **Secure multi-center data harmonization** (PROMINENT).
84 84  
85 -=== **Potential Data Sources** ===
51 +----
86 86  
87 -Neurodiagnoses maintains an updated list of potential biomedical databases relevant to neurodegenerative diseases.
53 +==== **Annotation System for Multi-Modal Data** ====
88 88  
89 -* Reference: [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
55 +To ensure **structured integration of diverse datasets**, **Neurodiagnoses** will implement an **AI-driven annotation system**, which will:
90 90  
91 -=== **1. Register for Access** ===
57 +* **Assign standardized metadata tags** to diagnostic features.
58 +* **Provide contextual explanations** for AI-based classifications.
59 +* **Track temporal disease progression annotations** to identify long-term trends.
92 92  
93 -Each external database requires individual registration and access approval. Follow the official guidelines of each database provider.
61 +----
94 94  
95 -* Ensure that you have completed all ethical approvals and data access agreements before integrating datasets into Neurodiagnoses.
96 -* Some repositories require a Data Usage Agreement (DUA) before downloading sensitive medical data.
63 +=== **2. AI-Based Analysis** ===
97 97  
98 -=== **2. Download & Prepare Data** ===
65 +==== **Machine Learning & Deep Learning Models** ====
99 99  
100 -Once access is granted, download datasets while complying with data usage policies. Ensure that the files meet Neurodiagnoses’ format requirements for smooth integration.
67 +**Risk Prediction Models:**
101 101  
102 -==== **Supported File Formats** ====
69 +* **LETHE’s cognitive risk prediction model** integrated into the annotation framework.
103 103  
104 -* Tabular Data: .csv, .tsv
105 -* Neuroimaging Data: .nii, .dcm
106 -* Genomic Data: .fasta, .vcf
107 -* Clinical Metadata: .json, .xml
71 +**Biomarker Classification & Probabilistic Imputation:**
108 108  
109 -==== **Mandatory Fields for Integration** ====
73 +* **KNN Imputer** and **Bayesian models** are used to handle **missing biomarker data**.
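To make the nearest-neighbour imputation idea concrete, here is a minimal, dependency-free sketch. It is not scikit-learn's `KNNImputer` and not the project's own code; the function name `knn_impute`, the biomarker values, and the choice of `k` are hypothetical and only illustrate the technique.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is Euclidean over the features observed in both rows, rescaled by
    the share of features compared so that sparse rows are not favoured.
    """
    n_features = len(rows[0])

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * n_features / len(shared))

    imputed = [list(r) for r in rows]  # copy, so the input stays untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that actually observed feature j.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed

# Hypothetical CSF values per subject: [amyloid-beta, tau, neurofilament light]
samples = [
    [600.0, 300.0, 900.0],
    [620.0, 310.0, None],
    [1000.0, 150.0, 500.0],
    [980.0, 160.0, 520.0],
]
completed = knn_impute(samples, k=1)
```

With `k=1` the missing value is copied from the single closest subject; larger `k` averages over more donors, which smooths the estimate at the cost of borrowing from less similar subjects.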
110 110  
111 -|=Field Name|=Description
112 -|Subject ID|Unique patient identifier
113 -|Diagnosis|Standardized disease classification
114 -|Biomarkers|CSF, plasma, or imaging biomarkers
115 -|Genetic Data|Whole-genome or exome sequencing
116 -|Neuroimaging Metadata|MRI/PET acquisition parameters
75 +**Neuroimaging Feature Extraction:**
117 117  
118 -=== **3. Upload Data to Neurodiagnoses** ===
77 +* **MRI & EEG data** annotated with **neuroanatomical feature labels**.
119 119  
120 -Once preprocessed, data can be uploaded to EBRAINS or GitHub.
79 +==== **AI-Powered Annotation System** ====
121 121  
122 -* (((
123 -**Option 1: Upload to EBRAINS Bucket**
81 +* Uses **SHAP-based interpretability tools** to explain model decisions.
82 +* Generates **automated clinical annotations** in structured reports.
83 +* Links findings to **standardized medical ontologies** (e.g., **SNOMED, HPO**).
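SHAP attributions are built on Shapley values. The sketch below computes exact Shapley values by enumerating feature coalitions for a toy model. It does not use the `shap` library and is not the project's implementation; the function `shapley_values`, the linear `risk_score`, its weights, and the feature values are all hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution of predict(instance) - predict(baseline).

    Features outside a coalition are held at their baseline value. The cost is
    exponential in the number of features, so this only suits tiny examples.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict(x)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Hypothetical linear risk score; the weights are illustrative, not fitted.
def risk_score(x):
    return 0.5 * x["tau"] - 0.2 * x["amyloid_beta"] + 0.1 * x["nfl"]

patient = {"tau": 4.0, "amyloid_beta": 1.0, "nfl": 3.0}
reference = {"tau": 2.0, "amyloid_beta": 2.0, "nfl": 1.0}
contributions = shapley_values(risk_score, patient, reference)
```

The contributions sum to the difference between the patient's score and the reference score, which is the property that lets a structured report state how much each biomarker moved a prediction.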
124 124  
125 -* Location: [[EBRAINS Neurodiagnoses Bucket>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
126 -* Ensure correct metadata tagging before submission.
127 -)))
128 -* (((
129 -**Option 2: Contribute via GitHub Repository**
85 +----
130 130  
131 -* Location: [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
132 -* Create a new folder under /data/ and include dataset description.
133 -)))
87 +=== **3. Diagnostic Framework & Clinical Decision Support** ===
134 134  
135 -//Note: For large datasets, please contact the project administrators before uploading.//
89 +==== **Tridimensional Diagnostic Axes** ====
136 136  
137 -=== **4. Integrate Data into AI Models** ===
91 +**Axis 1: Etiology (Pathogenic Mechanisms)**
138 138  
139 -Once uploaded, datasets must be harmonized and formatted before AI model training.
93 +* Classification based on **genetic markers, cellular pathways, and environmental risk factors**.
94 +* **AI-assisted annotation** provides **causal interpretations** for clinical use.
140 140  
141 -==== **Steps for Data Integration** ====
96 +**Axis 2: Molecular Markers & Biomarkers**
142 142  
143 -* Open Jupyter Notebooks on EBRAINS to run preprocessing scripts.
144 -* Standardize neuroimaging and biomarker formats using harmonization tools.
145 -* Use machine learning models to handle missing data and feature extraction.
146 -* Train AI models with newly integrated patient cohorts.
147 -* Reference: [[Detailed instructions can be found in docs/data_processing.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]].
98 +* **Integration of CSF, blood, and neuroimaging biomarkers**.
99 +* **Structured annotation** highlights **biological pathways linked to diagnosis**.
148 148  
101 +**Axis 3: Neuroanatomoclinical Correlations**
102 +
103 +* **MRI and EEG data** provide anatomical and functional insights.
104 +* **AI-generated progression maps** annotate **brain structure-function relationships**.
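One way to picture a three-axis output is as a small structured record. The sketch below is only an assumption about how such a record could look: the class names, field names, and placeholder values are hypothetical and are not taken from the Neurodiagnoses code base.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional
import json

@dataclass
class AxisAnnotation:
    """One axis of the diagnosis: a label plus the evidence supporting it."""
    label: str
    evidence: List[str] = field(default_factory=list)
    probability: Optional[float] = None  # model confidence, if available

@dataclass
class TridimensionalDiagnosis:
    """Hypothetical container for the three diagnostic axes."""
    subject_id: str
    etiology: AxisAnnotation               # Axis 1
    molecular_markers: AxisAnnotation      # Axis 2
    neuroanatomoclinical: AxisAnnotation   # Axis 3

    def to_json(self) -> str:
        # Serialise to JSON so the record can feed a structured report.
        return json.dumps(asdict(self), indent=2)

# Placeholder values for illustration only; not real patient data.
record = TridimensionalDiagnosis(
    subject_id="SUBJ-0001",
    etiology=AxisAnnotation("example etiology label", ["genetic marker (placeholder)"], 0.7),
    molecular_markers=AxisAnnotation("example biomarker profile", ["CSF tau (placeholder)"]),
    neuroanatomoclinical=AxisAnnotation("example atrophy pattern", ["MRI volumetry (placeholder)"]),
)
report = record.to_json()
```

Keeping each axis as its own object means evidence and confidence can be tracked per axis rather than for the diagnosis as a whole.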
105 +
149 149  ----
150 150  
151 -== **Database Sources Table** ==
108 +=== **4. Computational Workflow & Annotation Pipelines** ===
152 152  
153 -=== **Where to Insert This** ===
110 +==== **Data Processing Steps** ====
154 154  
155 -* GitHub: [[docs/data_sources.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_sources.md]]
156 -* EBRAINS Wiki: Collabs/neurodiagnoses/Data Sources
112 +**Data Ingestion:**
157 157  
158 -=== **Key Databases for Neurodiagnoses** ===
114 +* **Harmonized datasets** stored in **EBRAINS Bucket**.
115 +* **Preprocessing pipelines** clean and standardize data.
159 159  
160 -|=Database|=Focus Area|=Data Type|=Access Link
161 -|ADNI|Alzheimer's Disease|MRI, PET, CSF, cognitive tests|ADNI
162 -|PPMI|Parkinson’s Disease|Imaging, biospecimens|[[PPMI>>url:https://www.ppmi-info.org/]]
163 -|GP2|Genetic Data for PD|Whole-genome sequencing|[[GP2>>url:https://gp2.org/]]
164 -|Enroll-HD|Huntington’s Disease|Clinical, genetic, imaging|[[Enroll-HD>>url:https://enroll-hd.org/]]
165 -|GAAIN|Alzheimer's & Cognitive Decline|Multi-source data aggregation|[[GAAIN>>url:https://www.gaain.org/]]
166 -|UK Biobank|Population-wide studies|Genetic, imaging, health records|[[UK Biobank>>url:https://www.ukbiobank.ac.uk/]]
167 -|DPUK|Dementia & Aging|Imaging, genetics, lifestyle factors|[[DPUK>>url:https://www.dementiasplatform.uk/]]
168 -|PRION Registry|Prion Diseases|Clinical and genetic data|[[PRION Registry>>url:https://www.prionalliance.org/]]
169 -|DECIPHER|Rare Genetic Disorders|Genomic variants|DECIPHER
117 +**Feature Engineering:**
170 170  
171 -If you know a relevant dataset, submit a proposal in [[GitHub Issues>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/issues]].
119 +* **AI models** extract **clinically relevant patterns** from **EEG, MRI, and biomarkers**.
172 172  
121 +**AI-Generated Annotations:**
122 +
123 +* **Automated tagging** of diagnostic features in **structured reports**.
124 +* **Explainability modules (SHAP, LIME)** ensure transparency in predictions.
125 +
126 +**Clinical Decision Support Integration:**
127 +
128 +* **AI-annotated findings** fed into **interactive dashboards**.
129 +* **Clinicians can review, validate, and modify annotations**.
130 +
173 173  ----
174 174  
175 -== **Collaboration & Partnerships** ==
133 +=== **5. Validation & Real-World Testing** ===
176 176  
177 -=== **Where to Insert This** ===
135 +==== **Prospective Clinical Study** ====
178 178  
179 -* GitHub: [[docs/collaboration.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/collaboration.md]]
180 -* EBRAINS Wiki: Collabs/neurodiagnoses/Collaborations
137 +* **Multi-center validation** of AI-based **annotations & risk stratifications**.
138 +* **Benchmarking against clinician-based diagnoses**.
139 +* **Real-world testing** of AI-powered **structured reporting**.
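Benchmarking against clinician-based diagnoses needs an agreement measure. The page does not say which metric the study will use; as one common option, the sketch below computes Cohen's kappa, which corrects raw agreement for agreement expected by chance. The function name and the diagnosis labels are hypothetical.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two raters labelling the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(labels_a)
    # Observed agreement: share of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical labels for six cases (AD, PD, HC are placeholder classes).
clinician = ["AD", "AD", "PD", "PD", "AD", "HC"]
model = ["AD", "PD", "PD", "PD", "AD", "HC"]
kappa = cohen_kappa(clinician, model)
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so it is more informative than raw accuracy when one diagnosis dominates the cohort.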
181 181  
182 -=== **Partnering with Data Providers** ===
141 +==== **Quality Assurance & Explainability** ====
183 183  
184 -Beyond using existing datasets, Neurodiagnoses seeks partnerships with data repositories to:
143 +* **Annotations linked to structured knowledge graphs** for improved transparency.
144 +* **Interactive annotation editor** allows clinicians to validate AI outputs.
185 185  
186 -* Enable direct API-based data integration for real-time processing.
187 -* Co-develop harmonized AI-ready datasets with standardized annotations.
188 -* Secure funding opportunities through joint grant applications.
146 +----
189 189  
190 -=== **Interested in Partnering?** ===
148 +=== **6. Collaborative Development** ===
191 191  
192 -If you represent a research consortium or database provider, reach out to explore data-sharing agreements.
150 +The project is **open to contributions** from **researchers, clinicians, and developers**.
193 193  
194 -* Contact: [[info@neurodiagnoses.com>>mailto:info@neurodiagnoses.com]]
152 +**Key tools include:**
195 195  
154 +* **Jupyter Notebooks**: For data analysis and pipeline development.
155 +** Example: **probabilistic imputation**
156 +* **Wiki Pages**: For documenting methods and results.
157 +* **Drive and Bucket**: For sharing code, data, and outputs.
158 +* **Collaboration with related projects**:
159 +** Example: **Beyond the hype: AI in dementia – from early risk detection to disease treatment**
160 +
196 196  ----
197 197  
198 -== **Final Notes** ==
163 +=== **7. Tools and Technologies** ===
199 199  
200 -Neurodiagnoses continuously expands its data ecosystem to support AI-driven clinical decision-making. Researchers and institutions are encouraged to contribute new datasets and methodologies.
165 +==== **Programming Languages:** ====
201 201  
202 -For additional technical documentation:
167 +* **Python** for AI and data processing.
203 203  
204 -* [[GitHub Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
205 -* [[EBRAINS Collaboration Page>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]
169 +==== **Frameworks:** ====
206 206  
207 -If you experience issues integrating data, open a [[GitHub Issue>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/issues]] or consult the EBRAINS Neurodiagnoses Forum.
171 +* **TensorFlow** and **PyTorch** for machine learning.
172 +* **Flask** or **FastAPI** for backend services.
173 +
174 +==== **Visualization:** ====
175 +
176 +* **Plotly** for interactive and **Matplotlib** for static visualizations.
177 +
178 +==== **EBRAINS Services:** ====
179 +
180 +* **Collaboratory Lab** for running Notebooks.
181 +* **Buckets** for storing large datasets.
182 +
183 +----
184 +
185 +=== **Why This Matters** ===
186 +
187 +* **The annotation system ensures that AI-generated insights are structured, interpretable, and clinically meaningful.**
188 +* **It enables real-time tracking of disease progression across the three diagnostic axes.**
189 +* **It facilitates integration with electronic health records and decision-support tools, improving AI adoption in clinical workflows.**