Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 4.1
edited by manuelmenendez
on 2025/01/27 23:46
Change comment: There is no comment for this version
To version 17.1
edited by manuelmenendez
on 2025/02/09 13:01
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,107 +1,154 @@
1 -=== **Overview** ===
1 +== **Overview** ==
2 2  
3 -This section describes the step-by-step process used in the **Neurodiagnoses** project to develop a novel diagnostic framework for neurological diseases. The methodology integrates artificial intelligence (AI), biomedical ontologies, and computational neuroscience to create a structured, interpretable, and scalable diagnostic system.
3 +Neurodiagnoses develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility.**
4 4  
5 +This methodology integrates **multi-modal data**, including:
6 +* **Genetic data** (whole-genome sequencing, polygenic risk scores).
7 +* **Neuroimaging** (MRI, PET, EEG, MEG).
8 +* **Neurophysiological data** (EEG-based biomarkers, sleep actigraphy).
9 +* **CSF & blood biomarkers** (amyloid-beta, tau, neurofilament light).
10 +
11 +By applying **machine learning models**, Neurodiagnoses generates **structured, explainable diagnostic outputs** to assist **clinical decision-making** and **biomarker-driven patient stratification.**
12 +
5 5  ----
6 6  
7 -=== **1. Data Integration** ===
15 +== **Data Integration & External Databases** ==
8 8  
9 -==== **Data Sources** ====
17 +=== **How to Use External Databases in Neurodiagnoses** ===
10 10  
11 -* **Biomedical Ontologies**:
12 -** Human Phenotype Ontology (HPO) for phenotypic abnormalities.
13 -** Gene Ontology (GO) for molecular and cellular processes.
14 -* **Neuroimaging Datasets**:
15 -** Example: Alzheimer’s Disease Neuroimaging Initiative (ADNI), OpenNeuro.
16 -* **Clinical and Biomarker Data**:
17 -** Anonymized clinical reports, molecular biomarkers, and test results.
19 +Neurodiagnoses integrates data from multiple **biomedical and neurological research databases**. Researchers can follow these steps to **access, prepare, and integrate** data into the Neurodiagnoses framework.
18 18  
19 -==== **Data Preprocessing** ====
21 +**Potential Data Sources**
22 +**Reference:** [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
20 20  
21 -1. **Standardization**: Ensure all data sources are normalized to a common format.
22 -1. **Feature Selection**: Identify relevant features for diagnosis (e.g., biomarkers, imaging scores).
23 -1. **Data Cleaning**: Handle missing values and remove duplicates.
24 +=== **Register for Access** ===
24 24  
25 -----
26 +Each **external database** requires **individual registration** and approval.
27 +✔️ Follow the official **data access guidelines** of each provider.
28 +✔️ Ensure compliance with **ethical approvals** and **data use agreements (DUAs).**
26 26  
27 -=== **2. AI-Based Analysis** ===
30 +=== **Download & Prepare Data** ===
28 28  
29 -==== **Model Development** ====
32 +Once access is granted, download the datasets in accordance with each provider's **compliance guidelines** and make sure they meet the **format requirements** for integration.
30 30  
31 -* **Embedding Models**: Use pre-trained models like BioBERT or BioLORD for text data.
32 -* **Classification Models**:
33 -** Algorithms: Random Forest, Support Vector Machines (SVM), or neural networks.
34 -** Purpose: Predict the likelihood of specific neurological conditions based on input data.
34 +**Supported File Formats**
35 35  
36 -==== **Dimensionality Reduction and Interpretability** ====
36 +* **Tabular Data**: .csv, .tsv
37 +* **Neuroimaging Data**: .nii, .dcm
38 +* **Genomic Data**: .fasta, .vcf
39 +* **Clinical Metadata**: .json, .xml
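
A minimal sketch of how the supported formats listed above could be checked before upload. The category labels and the `classify_file` helper are assumptions made for illustration; they are not part of the Neurodiagnoses codebase.

```python
from pathlib import Path
from typing import Optional

# Extension-to-category map copied from the "Supported File Formats" list.
# The category labels on the right are hypothetical names chosen for this sketch.
SUPPORTED_FORMATS = {
    ".csv": "tabular",
    ".tsv": "tabular",
    ".nii": "neuroimaging",
    ".dcm": "neuroimaging",
    ".fasta": "genomic",
    ".vcf": "genomic",
    ".json": "clinical_metadata",
    ".xml": "clinical_metadata",
}


def classify_file(filename: str) -> Optional[str]:
    """Return the data category for a file name, or None if the extension is unsupported."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the last extension
    return SUPPORTED_FORMATS.get(suffix)
```

A caller could reject any file for which `classify_file` returns `None` before attempting an upload.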
37 37  
38 -* Leverage [[DEIBO>>https://drive.ebrains.eu/f/8d7157708cde4b258db0/]] (Data-driven Embedding Interpretation Based on Ontologies) to connect model dimensions to ontology concepts.
39 -* Evaluate interpretability using metrics like the Area Under the Interpretability Curve (AUIC).
41 +**Mandatory Fields for Integration**
40 40  
43 +|=**Field Name**|=**Description**
44 +|**Subject ID**|Unique patient identifier
45 +|**Diagnosis**|Standardized disease classification
46 +|**Biomarkers**|CSF, plasma, or imaging biomarkers
47 +|**Genetic Data**|Whole-genome or exome sequencing
48 +|**Neuroimaging Metadata**|MRI/PET acquisition parameters
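
The mandatory fields in the table above could be checked with a short script before integration. This is a sketch under stated assumptions: the page does not define column names, so the snake_case names below are hypothetical, and the only rules enforced are the ones the table implies (all columns present, `subject_id` unique, `diagnosis` not empty).

```python
import csv
import io

# Hypothetical column names derived from the "Mandatory Fields" table.
REQUIRED_COLUMNS = [
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
]


def validate_rows(csv_text):
    """Return a list of (line_number, message) problems found in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError("missing mandatory columns: " + ", ".join(missing))

    problems = []
    seen_ids = set()
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        subject_id = (row["subject_id"] or "").strip()
        if not subject_id:
            problems.append((line_no, "empty subject_id"))
        elif subject_id in seen_ids:
            problems.append((line_no, "duplicate subject_id " + subject_id))
        seen_ids.add(subject_id)
        if not (row["diagnosis"] or "").strip():
            problems.append((line_no, "empty diagnosis"))
    return problems
```

An empty list means the file passed these basic checks; it says nothing about whether the values themselves are clinically valid.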
49 +
50 +=== **Upload Data to Neurodiagnoses** ===
51 +
52 +**Option 1:** Upload to **EBRAINS Bucket** → [[Neurodiagnoses Data Storage>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
53 +**Option 2:** Contribute via **GitHub Repository** → [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
54 +
55 +**For large datasets, please contact project administrators before uploading.**
56 +
57 +=== **Integrate Data into AI Models** ===
58 +
59 +1. Use **Jupyter Notebooks** on EBRAINS for **data preprocessing.**
60 +1. Standardize data using **harmonization tools.**
61 +1. Train AI models with **newly integrated datasets.**
62 +
63 +**Reference:** [[Data Processing Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]]
64 +
41 41  ----
42 42  
43 -=== **3. Diagnostic Framework** ===
67 +== **AI-Powered Annotation & Machine Learning Models** ==
44 44  
45 -==== **Axes of Diagnosis** ====
69 +Neurodiagnoses applies **advanced machine learning models** to classify CNS diseases, extract features from **biomarkers and neuroimaging**, and provide **AI-powered annotation.**
46 46  
47 -The framework organizes diagnostic data into three axes:
71 +=== **AI Model Categories** ===
48 48  
49 -1. **Etiology**: Genetic and environmental risk factors.
50 -1. **Molecular Markers**: Biomarkers such as amyloid-beta, tau, and alpha-synuclein.
51 -1. **Neuroanatomical Correlations**: Results from neuroimaging (e.g., MRI, PET).
73 +|=**Model Type**|=**Function**|=**Example Algorithms**
74 +|**Probabilistic Diagnosis**|Assigns probability scores to multiple CNS disorders.|Random Forest, XGBoost, Bayesian Networks
75 +|**Tridimensional Diagnosis**|Classifies disorders based on Etiology, Biomarkers, and Neuroanatomical Correlations.|CNNs, Transformers, Autoencoders
76 +|**Biomarker Prediction**|Predicts missing biomarker values using regression or imputation.|KNN Imputation, Bayesian Estimation
77 +|**Neuroimaging Feature Extraction**|Extracts patterns from MRI, PET, EEG.|CNNs, Graph Neural Networks
78 +|**Clinical Decision Support**|Generates AI-driven diagnostic reports.|SHAP Explainability Tools
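
The Biomarker Prediction row names KNN imputation. The following is a minimal, dependency-free sketch of that technique, not the project's implementation: a missing value is replaced by the mean of the `k` nearest complete records, with distance measured on the remaining features. The function name, the record layout, and the default `k` are assumptions.

```python
import math


def knn_impute(rows, target, k=2):
    """Fill None values of `target` with the mean of the k nearest complete rows.

    `rows` is a list of dicts with identical keys. Every feature other than
    `target` must be numeric and present. Distances are plain Euclidean, so
    features should be scaled beforehand if their ranges differ widely.
    """
    features = [key for key in rows[0] if key != target]
    complete = [row for row in rows if row[target] is not None]
    if not complete:
        raise ValueError("no complete rows to impute from")

    imputed = []
    for row in rows:
        filled = dict(row)  # copy so the input is left unchanged
        if row[target] is None:
            point = [row[f] for f in features]
            nearest = sorted(
                complete,
                key=lambda other: math.dist(point, [other[f] for f in features]),
            )[:k]
            filled[target] = sum(n[target] for n in nearest) / len(nearest)
        imputed.append(filled)
    return imputed
```

In practice a library implementation such as scikit-learn's `KNNImputer` would be the more usual choice; the sketch only shows the idea.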
52 52  
53 -==== **Recommendation System** ====
80 +**Reference:** [[AI Model Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/models.md]]
54 54  
55 -* Suggests additional tests or biomarkers if gaps are detected in the data.
56 -* Prioritizes tests based on clinical impact and cost-effectiveness.
82 +----
57 57  
84 +== **Clinical Decision Support & Tridimensional Diagnostic Framework** ==
85 +
86 +Neurodiagnoses generates **structured AI reports** for clinicians, combining:
87 +
88 +* **Probabilistic Diagnosis:** AI-generated ranking of potential diagnoses.
89 +* **Tridimensional Classification:** Standardized diagnostic reports based on:
90 +
91 +1. **Axis 1:** **Etiology** → Genetic, Autoimmune, Prion, Toxic, Vascular.
92 +1. **Axis 2:** **Molecular Markers** → CSF, Neuroinflammation, EEG biomarkers.
93 +1. **Axis 3:** **Neuroanatomoclinical Correlations** → MRI atrophy, PET.
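
One way to picture a report that combines the probabilistic ranking with the three axes is a small data structure. This is an illustrative sketch only: the class name, field names, and `rank_diagnoses` helper are hypothetical, and the page does not specify the actual report schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


def rank_diagnoses(probabilities: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort candidate diagnoses from most to least probable."""
    return sorted(probabilities.items(), key=lambda item: item[1], reverse=True)


@dataclass
class TridimensionalReport:
    """Hypothetical container for one subject's structured report."""

    subject_id: str
    etiology: List[str] = field(default_factory=list)              # Axis 1
    molecular_markers: List[str] = field(default_factory=list)     # Axis 2
    neuroanatomoclinical: List[str] = field(default_factory=list)  # Axis 3
    probabilities: Dict[str, float] = field(default_factory=dict)

    def summary(self) -> str:
        """Render the report as plain text, one line per section."""
        ranked = ", ".join(
            "{} ({:.0%})".format(name, p)
            for name, p in rank_diagnoses(self.probabilities)
        )
        return "\n".join([
            "Subject: " + self.subject_id,
            "Ranking: " + ranked,
            "Axis 1 (Etiology): " + ", ".join(self.etiology),
            "Axis 2 (Molecular Markers): " + ", ".join(self.molecular_markers),
            "Axis 3 (Neuroanatomoclinical): " + ", ".join(self.neuroanatomoclinical),
        ])
```

Keeping the ranking separate from the axis fields means either part can be produced by a different model, which matches the split between the Probabilistic and Tridimensional model types.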
94 +
95 +**Reference:** [[Tridimensional Classification Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/classification.md]]
96 +
58 58  ----
59 59  
60 -=== **4. Computational Workflow** ===
99 +== **Data Security, Compliance & Federated Learning** ==
61 61  
62 -1. **Data Loading**: Import data from storage (Drive or Bucket).
63 -1. **Feature Engineering**: Generate derived features from the raw data.
64 -1. **Model Training**:
65 -1*. Split data into training, validation, and test sets.
66 -1*. Train models with cross-validation to ensure robustness.
67 -1. **Evaluation**:
68 -1*. Metrics: Accuracy, F1-Score, AUIC for interpretability.
69 -1*. Compare against baseline models and domain benchmarks.
101 +✔ **Privacy-Preserving AI**: Implements **Federated Learning**, ensuring that patient data **never leaves** local institutions.
102 +✔ **Secure Data Access**: Data remains **stored on EBRAINS MIP servers** and is protected with **differential privacy techniques.**
103 +✔ **Ethical & GDPR Compliance**: Data-sharing agreements **must be signed** before use.
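
The page does not say which federated learning algorithm is used. As an illustration of the general idea, the sketch below shows federated averaging (FedAvg), a common choice: each institution trains locally and shares only model weights and a sample count, and a coordinator combines them into a sample-weighted mean. The function name and input layout are assumptions.

```python
def federated_average(site_updates):
    """Combine per-site model weights into a sample-weighted average.

    `site_updates` is a list of (weights, n_samples) pairs, where `weights`
    is a list of floats of equal length at every site. No patient-level
    data appears in the input, only model parameters and counts.
    """
    total_samples = sum(n for _, n in site_updates)
    if total_samples == 0:
        raise ValueError("no samples reported by any site")

    n_weights = len(site_updates[0][0])
    if any(len(weights) != n_weights for weights, _ in site_updates):
        raise ValueError("all sites must report the same number of weights")

    return [
        sum(weights[i] * n for weights, n in site_updates) / total_samples
        for i in range(n_weights)
    ]
```

Sites with more samples pull the global model further toward their local weights, which is the defining property of this averaging scheme.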
70 70  
105 +**Reference:** [[Data Protection & Federated Learning>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/security.md]]
106 +
71 71  ----
72 72  
73 -=== **5. Validation** ===
109 +== **Data Processing & Integration with Clinica.Run** ==
74 74  
75 -==== **Internal Validation** ====
111 +Neurodiagnoses now supports **Clinica.Run**, an **open-source neuroimaging platform** for **multimodal data processing.**
76 76  
77 -* Test the system using simulated datasets and known clinical cases.
78 -* Fine-tune models based on validation results.
113 +=== **How It Works** ===
79 79  
80 -==== **External Validation** ====
115 +✔ **Neuroimaging Preprocessing**: MRI, PET, and EEG data are preprocessed using **Clinica.Run pipelines.**
116 +✔ **Automated Biomarker Extraction**: Extracts volumetric, metabolic, and functional biomarkers.
117 +✔ **Data Security & Compliance**: Clinica.Run is **GDPR & HIPAA-compliant.**
81 81  
82 -* Collaborate with research institutions and hospitals to test the system in real-world settings.
83 -* Use anonymized patient data to ensure privacy compliance.
119 +=== **Implementation Steps** ===
84 84  
121 +1. Install **Clinica.Run** dependencies.
122 +1. Configure the **Clinica.Run pipeline** in clinica_run_config.json.
123 +1. Run **biomarker extraction pipelines** for AI-based diagnostics.
124 +
125 +**Reference:** [[Clinica.Run Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/clinica_run.md]]
126 +
85 85  ----
86 86  
87 -=== **6. Collaborative Development** ===
129 +== **Collaborative Development & Research** ==
88 88  
89 -The project is open to contributions from researchers, clinicians, and developers. Key tools include:
131 +**We Use GitHub to Develop AI Models & Store Research Data**
90 90  
91 -* **Jupyter Notebooks**: For data analysis and pipeline development.
92 -** Example: [[probabilistic imputation>>https://drive.ebrains.eu/f/4f69ab52f7734ef48217/]]
93 -* **Wiki Pages**: For documenting methods and results.
94 -* **Drive and Bucket**: For sharing code, data, and outputs.
133 +* **GitHub Repository:** AI model training scripts.
134 +* **GitHub Issues:** Tracks ongoing research questions.
135 +* **GitHub Wiki:** Project documentation & user guides.
95 95  
137 +**We Use EBRAINS for Data & Collaboration**
138 +
139 +* **EBRAINS Buckets:** Large-scale neuroimaging and biomarker storage.
140 +* **EBRAINS Jupyter Notebooks:** Cloud-based AI model execution.
141 +* **EBRAINS Wiki:** Research documentation and updates.
142 +
143 +**Join the Project Forum:** [[GitHub Discussions>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]]
144 +
96 96  ----
97 97  
98 -=== **7. Tools and Technologies** ===
147 +**For Additional Documentation:**
99 99  
100 -* **Programming Languages**: Python for AI and data processing.
101 -* **Frameworks**:
102 -** TensorFlow and PyTorch for machine learning.
103 -** Flask or FastAPI for backend services.
104 -* **Visualization**: Plotly and Matplotlib for interactive and static visualizations.
105 -* **EBRAINS Services**:
106 -** Collaboratory Lab for running Notebooks.
107 -** Buckets for storing large datasets.
149 +* **GitHub Repository:** [[Neurodiagnoses AI Models>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
150 +* **EBRAINS Wiki:** [[Neurodiagnoses Research Collaboration>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]
151 +
152 +----
153 +
154 +**Neurodiagnoses is Open for Contributions – Join Us Today!**