Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 17.1
edited by manuelmenendez
on 2025/02/09 13:01
Change comment: There is no comment for this version
To version 4.3
edited by manuelmenendez
on 2025/01/29 19:11
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,154 +1,109 @@
1 -== **Overview** ==
1 +=== **Overview** ===
2 2  
3 -Neurodiagnoses develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility.**
3 +This section describes the step-by-step process used in the **Neurodiagnoses** project to develop a novel diagnostic framework for neurological diseases. The methodology integrates artificial intelligence (AI), biomedical ontologies, and computational neuroscience to create a structured, interpretable, and scalable diagnostic system.
4 4  
5 -This methodology integrates **multi-modal data**, including:
6 -**Genetic data** (whole-genome sequencing, polygenic risk scores).
7 -**Neuroimaging** (MRI, PET, EEG, MEG).
8 -**Neurophysiological data** (EEG-based biomarkers, sleep actigraphy).
9 -**CSF & Blood Biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
10 -
11 -By applying **machine learning models**, Neurodiagnoses generates **structured, explainable diagnostic outputs** to assist **clinical decision-making** and **biomarker-driven patient stratification.**
12 -
13 13  ----
14 14  
15 -== **Data Integration & External Databases** ==
7 +=== **1. Data Integration** ===
16 16  
17 -=== **How to Use External Databases in Neurodiagnoses** ===
9 +==== **Data Sources** ====
18 18  
19 -Neurodiagnoses integrates data from multiple **biomedical and neurological research databases**. Researchers can follow these steps to **access, prepare, and integrate** data into the Neurodiagnoses framework.
11 +* **Biomedical Ontologies**:
12 +** Human Phenotype Ontology (HPO) for phenotypic abnormalities.
13 +** Gene Ontology (GO) for molecular and cellular processes.
14 +* **Neuroimaging Datasets**:
15 +** Example: Alzheimer’s Disease Neuroimaging Initiative (ADNI), OpenNeuro.
16 +* **Clinical and Biomarker Data**:
17 +** Anonymized clinical reports, molecular biomarkers, and test results.
20 20  
21 -**Potential Data Sources**
22 -**Reference:** [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
23 23  
24 -=== **Register for Access** ===
20 +==== **Data Preprocessing** ====
25 25  
26 -Each **external database** requires **individual registration** and approval.
27 -✔️ Follow the official **data access guidelines** of each provider.
28 -✔️ Ensure compliance with **ethical approvals** and **data-sharing agreements (DUAs).**
22 +1. **Standardization**: Ensure all data sources are normalized to a common format.
23 +1. **Feature Selection**: Identify relevant features for diagnosis (e.g., biomarkers, imaging scores).
24 +1. **Data Cleaning**: Handle missing values and remove duplicates.
29 29  
30 -=== **Download & Prepare Data** ===
31 -
32 -Once access is granted, download datasets **following compliance guidelines** and **format requirements** for integration.
33 -
34 -**Supported File Formats**
35 -
36 -* **Tabular Data**: .csv, .tsv
37 -* **Neuroimaging Data**: .nii, .dcm
38 -* **Genomic Data**: .fasta, .vcf
39 -* **Clinical Metadata**: .json, .xml
40 -
41 -**Mandatory Fields for Integration**
42 -
43 -|=**Field Name**|=**Description**
44 -|**Subject ID**|Unique patient identifier
45 -|**Diagnosis**|Standardized disease classification
46 -|**Biomarkers**|CSF, plasma, or imaging biomarkers
47 -|**Genetic Data**|Whole-genome or exome sequencing
48 -|**Neuroimaging Metadata**|MRI/PET acquisition parameters
49 -
50 -=== **Upload Data to Neurodiagnoses** ===
51 -
52 -**Option 1:** Upload to **EBRAINS Bucket** → [[Neurodiagnoses Data Storage>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
53 -**Option 2:** Contribute via **GitHub Repository** → [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
54 -
55 -**For large datasets, please contact project administrators before uploading.**
56 -
57 -=== **Integrate Data into AI Models** ===
58 -
59 -Use **Jupyter Notebooks** on EBRAINS for **data preprocessing.**
60 -Standardize data using **harmonization tools.**
61 -Train AI models with **newly integrated datasets.**
62 -
63 -**Reference:** [[Data Processing Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]]
64 -
65 65  ----
66 66  
67 -== **AI-Powered Annotation & Machine Learning Models** ==
28 +=== **2. AI-Based Analysis** ===
68 68  
69 -Neurodiagnoses applies **advanced machine learning models** to classify CNS diseases, extract features from **biomarkers and neuroimaging**, and provide **AI-powered annotation.**
30 +==== **Model Development** ====
70 70  
71 -=== **AI Model Categories** ===
32 +* **Embedding Models**: Use pre-trained models like BioBERT or BioLORD for text data.
33 +* **Classification Models**:
34 +** Algorithms: Random Forest, Support Vector Machines (SVM), or neural networks.
35 +** Purpose: Predict the likelihood of specific neurological conditions based on input data.
72 72  
73 -|=**Model Type**|=**Function**|=**Example Algorithms**
74 -|**Probabilistic Diagnosis**|Assigns probability scores to multiple CNS disorders.|Random Forest, XGBoost, Bayesian Networks
75 -|**Tridimensional Diagnosis**|Classifies disorders based on Etiology, Biomarkers, and Neuroanatomical Correlations.|CNNs, Transformers, Autoencoders
76 -|**Biomarker Prediction**|Predicts missing biomarker values using regression.|KNN Imputation, Bayesian Estimation
77 -|**Neuroimaging Feature Extraction**|Extracts patterns from MRI, PET, EEG.|CNNs, Graph Neural Networks
78 -|**Clinical Decision Support**|Generates AI-driven diagnostic reports.|SHAP Explainability Tools
37 +==== **Dimensionality Reduction and Interpretability** ====
79 79  
80 -**Reference:** [[AI Model Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/models.md]]
39 +* Leverage [[DEIBO>>https://drive.ebrains.eu/f/8d7157708cde4b258db0/]] (Data-driven Embedding Interpretation Based on Ontologies) to connect model dimensions to ontology concepts.
40 +* Evaluate interpretability using metrics like the Area Under the Interpretability Curve (AUIC).
81 81  
82 82  ----
83 83  
84 -== **Clinical Decision Support & Tridimensional Diagnostic Framework** ==
44 +=== **3. Diagnostic Framework** ===
85 85  
86 -Neurodiagnoses generates **structured AI reports** for clinicians, combining:
46 +==== **Axes of Diagnosis** ====
87 87  
88 -**Probabilistic Diagnosis:** AI-generated ranking of potential diagnoses.
89 -**Tridimensional Classification:** Standardized diagnostic reports based on:
48 +The framework organizes diagnostic data into three axes:
90 90  
91 -1. **Axis 1:** **Etiology** → Genetic, Autoimmune, Prion, Toxic, Vascular.
92 -1. **Axis 2:** **Molecular Markers** → CSF, Neuroinflammation, EEG biomarkers.
93 -1. **Axis 3:** **Neuroanatomoclinical Correlations** → MRI atrophy, PET.
50 +1. **Etiology**: Genetic and environmental risk factors.
51 +1. **Molecular Markers**: Biomarkers such as amyloid-beta, tau, and alpha-synuclein.
52 +1. **Neuroanatomical Correlations**: Results from neuroimaging (e.g., MRI, PET).
94 94  
95 -**Reference:** [[Tridimensional Classification Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/classification.md]]
54 +==== **Recommendation System** ====
96 96  
56 +* Suggests additional tests or biomarkers if gaps are detected in the data.
57 +* Prioritizes tests based on clinical impact and cost-effectiveness.
58 +
97 97  ----
98 98  
99 -== **Data Security, Compliance & Federated Learning** ==
61 +=== **4. Computational Workflow** ===
100 100  
101 -✔ **Privacy-Preserving AI**: Implements **Federated Learning**, ensuring that patient data **never leaves** local institutions.
102 -✔ **Secure Data Access**: Data remains **stored in EBRAINS MIP servers** using **differential privacy techniques.**
103 -✔ **Ethical & GDPR Compliance**: Data-sharing agreements **must be signed** before use.
63 +1. **Data Loading**: Import data from storage (Drive or Bucket).
64 +1. **Feature Engineering**: Generate derived features from the raw data.
65 +1. **Model Training**:
66 +1*. Split data into training, validation, and test sets.
67 +1*. Train models with cross-validation to ensure robustness.
68 +1. **Evaluation**:
69 +1*. Metrics: Accuracy, F1-Score, AUIC for interpretability.
70 +1*. Compare against baseline models and domain benchmarks.
104 104  
105 -**Reference:** [[Data Protection & Federated Learning>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/security.md]]
106 -
107 107  ----
108 108  
109 -== **Data Processing & Integration with Clinica.Run** ==
74 +=== **5. Validation** ===
110 110  
111 -Neurodiagnoses now supports **Clinica.Run**, an **open-source neuroimaging platform** for **multimodal data processing.**
76 +==== **Internal Validation** ====
112 112  
113 -=== **How It Works** ===
78 +* Test the system using simulated datasets and known clinical cases.
79 +* Fine-tune models based on validation results.
114 114  
115 -✔ **Neuroimaging Preprocessing**: MRI, PET, EEG data is preprocessed using **Clinica.Run pipelines.**
116 -✔ **Automated Biomarker Extraction**: Extracts volumetric, metabolic, and functional biomarkers.
117 -✔ **Data Security & Compliance**: Clinica.Run is **GDPR & HIPAA-compliant.**
81 +==== **External Validation** ====
118 118  
119 -=== **Implementation Steps** ===
83 +* Collaborate with research institutions and hospitals to test the system in real-world settings.
84 +* Use anonymized patient data to ensure privacy compliance.
120 120  
121 -1. Install **Clinica.Run** dependencies.
122 -1. Configure **Clinica.Run pipeline** in clinica_run_config.json.
123 -1. Run **biomarker extraction pipelines** for AI-based diagnostics.
124 -
125 -**Reference:** [[Clinica.Run Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/clinica_run.md]]
126 -
127 127  ----
128 128  
129 -== **Collaborative Development & Research** ==
88 +=== **6. Collaborative Development** ===
130 130  
131 -**We Use GitHub to Develop AI Models & Store Research Data**
90 +The project is open to contributions from researchers, clinicians, and developers. Key tools include:
132 132  
133 -* **GitHub Repository:** AI model training scripts.
134 -* **GitHub Issues:** Tracks ongoing research questions.
135 -* **GitHub Wiki:** Project documentation & user guides.
92 +* **Jupyter Notebooks**: For data analysis and pipeline development.
93 +** Example: [[probabilistic imputation>>https://drive.ebrains.eu/f/4f69ab52f7734ef48217/]]
94 +* **Wiki Pages**: For documenting methods and results.
95 +* **Drive and Bucket**: For sharing code, data, and outputs.
96 +* **Collaboration with related projects**: For instance, [[//Beyond the hype: AI in dementia – from early risk detection to disease treatment//>>https://www.lethe-project.eu/beyond-the-hype-ai-in-dementia-from-early-risk-detection-to-disease-treatment/]]
136 136  
137 -**We Use EBRAINS for Data & Collaboration**
138 -
139 -* **EBRAINS Buckets:** Large-scale neuroimaging and biomarker storage.
140 -* **EBRAINS Jupyter Notebooks:** Cloud-based AI model execution.
141 -* **EBRAINS Wiki:** Research documentation and updates.
142 -
143 -**Join the Project Forum:** [[GitHub Discussions>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]]
144 -
145 145  ----
146 146  
147 -**For Additional Documentation:**
100 +=== **7. Tools and Technologies** ===
148 148  
149 -* **GitHub Repository:** [[Neurodiagnoses AI Models>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
150 -* **EBRAINS Wiki:** [[Neurodiagnoses Research Collaboration>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]
151 -
152 -----
153 -
154 -**Neurodiagnoses is Open for Contributions – Join Us Today!**
102 +* **Programming Languages**: Python for AI and data processing.
103 +* **Frameworks**:
104 +** TensorFlow and PyTorch for machine learning.
105 +** Flask or FastAPI for backend services.
106 +* **Visualization**: Plotly and Matplotlib for interactive and static visualizations.
107 +* **EBRAINS Services**:
108 +** Collaboratory Lab for running Notebooks.
109 +** Buckets for storing large datasets.
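
As an illustration of the Data Preprocessing steps listed in the added version (standardization and data cleaning), here is a minimal sketch in plain Python. It is not the project's actual pipeline. The alias table, the field names (`subject_id`, `amyloid_beta`), the sample records, and the choice of mean imputation for missing values are all assumptions made for this example. Feature selection is left out because it depends on the model.

```python
from statistics import mean

# Hypothetical mapping from source-specific column names to one common schema.
COLUMN_ALIASES = {
    "subject": "subject_id",
    "patient": "subject_id",
    "abeta": "amyloid_beta",
}


def standardize(record):
    """Rename known aliases so every record uses the same field names."""
    return {COLUMN_ALIASES.get(key, key): value for key, value in record.items()}


def drop_duplicates(records, key="subject_id"):
    """Keep only the first record seen for each value of `key`."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def impute_mean(records, field):
    """Replace missing (None or absent) values of `field` with the observed mean."""
    observed = [r[field] for r in records if r.get(field) is not None]
    if not observed:
        return records  # nothing to impute from; leave the data unchanged
    fill = mean(observed)
    return [
        {**r, field: r[field] if r.get(field) is not None else fill}
        for r in records
    ]


def preprocess(records, numeric_field="amyloid_beta"):
    """Standardize field names, remove duplicate subjects, fill missing values."""
    records = [standardize(r) for r in records]
    records = drop_duplicates(records)
    return impute_mean(records, numeric_field)


# Made-up records from three imaginary sources with inconsistent field names.
raw = [
    {"subject": "S1", "abeta": 800.0},
    {"patient": "S2", "amyloid_beta": None},
    {"subject": "S1", "abeta": 800.0},  # duplicate of the first record
    {"subject_id": "S3", "amyloid_beta": 600.0},
]
cleaned = preprocess(raw)
```

Mean imputation is used here only because it is short. The Jupyter notebook linked in the Collaborative Development section refers to probabilistic imputation, which would replace `impute_mean` in a real workflow.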