Changes for page Methodology

Last modified by manuelmenendez on 2025/03/14 08:31

From version 17.1
edited by manuelmenendez
on 2025/02/09 13:01
Change comment: There is no comment for this version
To version 4.3
edited by manuelmenendez
on 2025/01/29 19:11
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,154 +1,109 @@
1 -== **Overview** ==
1 +=== **Overview** ===
2 2  
3 -Neurodiagnoses develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility.**
3 +This section describes the step-by-step process used in the **Neurodiagnoses** project to develop a novel diagnostic framework for neurological diseases. The methodology integrates artificial intelligence (AI), biomedical ontologies, and computational neuroscience to create a structured, interpretable, and scalable diagnostic system.
4 4  
5 -This methodology integrates **multi-modal data**, including:
6 -**Genetic data** (whole-genome sequencing, polygenic risk scores).
7 -**Neuroimaging** (MRI, PET, EEG, MEG).
8 -**Neurophysiological data** (EEG-based biomarkers, sleep actigraphy).
9 -**CSF & Blood Biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
10 -
11 -By applying **machine learning models**, Neurodiagnoses generates **structured, explainable diagnostic outputs** to assist **clinical decision-making** and **biomarker-driven patient stratification.**
12 -
13 13  ----
14 14  
15 -== **Data Integration & External Databases** ==
7 +=== **1. Data Integration** ===
16 16  
17 -=== **How to Use External Databases in Neurodiagnoses** ===
9 +==== **Data Sources** ====
18 18  
19 -Neurodiagnoses integrates data from multiple **biomedical and neurological research databases**. Researchers can follow these steps to **access, prepare, and integrate** data into the Neurodiagnoses framework.
11 +* **Biomedical Ontologies**:
12 +** Human Phenotype Ontology (HPO) for phenotypic abnormalities.
13 +** Gene Ontology (GO) for molecular and cellular processes.
14 +* **Neuroimaging Datasets**:
15 +** Examples: Alzheimer’s Disease Neuroimaging Initiative (ADNI), OpenNeuro.
16 +* **Clinical and Biomarker Data**:
17 +** Anonymized clinical reports, molecular biomarkers, and test results.
20 20  
21 -**Potential Data Sources**
22 -**Reference:** [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]
23 23  
24 -=== **Register for Access** ===
20 +==== **Data Preprocessing** ====
25 25  
26 -Each **external database** requires **individual registration** and approval.
27 -✔️ Follow the official **data access guidelines** of each provider.
27 -✔️ Ensure compliance with **ethical approvals** and **data use agreements (DUAs).**
22 +1. **Standardization**: Ensure all data sources are normalized to a common format.
23 +1. **Feature Selection**: Identify relevant features for diagnosis (e.g., biomarkers, imaging scores).
24 +1. **Data Cleaning**: Handle missing values and remove duplicates.
29 29  
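The three preprocessing steps listed above (standardization, feature selection, data cleaning) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's pipeline: the record layout, the field names (`Tau`, `MMSE`, `notes`), and the choice of mean imputation are all hypothetical.

```python
from statistics import mean

# Hypothetical input records; the field names are illustrative only.
records = [
    {"subject_id": "S1", "Tau": "310", "MMSE": "27", "notes": "baseline"},
    {"subject_id": "S2", "Tau": "", "MMSE": "22", "notes": "follow-up"},
    {"subject_id": "S1", "Tau": "310", "MMSE": "27", "notes": "baseline"},  # duplicate
]
FEATURES = ["Tau", "MMSE"]  # assumed feature selection; "notes" is dropped


def preprocess(rows, features):
    cleaned, seen = [], set()
    for row in rows:
        # Standardization: lower-case feature names, numeric values as float.
        norm = {"subject_id": row["subject_id"]}
        for name in features:
            raw = row.get(name, "")
            norm[name.lower()] = float(raw) if raw not in ("", None) else None
        # Data cleaning: skip exact duplicates of an already seen record.
        key = tuple(norm.items())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(norm)
    # Data cleaning: fill missing values with the mean of the observed values.
    for name in (f.lower() for f in features):
        observed = [r[name] for r in cleaned if r[name] is not None]
        fill = mean(observed) if observed else None
        for r in cleaned:
            if r[name] is None:
                r[name] = fill
    return cleaned


result = preprocess(records, FEATURES)
```

Mean imputation is used here only because it is short; any other imputation strategy can be substituted in the last loop.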
30 -=== **Download & Prepare Data** ===
31 -
32 -Once access is granted, download datasets **following compliance guidelines** and **format requirements** for integration.
33 -
34 -**Supported File Formats**
35 -
36 -* **Tabular Data**: .csv, .tsv
37 -* **Neuroimaging Data**: .nii, .dcm
38 -* **Genomic Data**: .fasta, .vcf
39 -* **Clinical Metadata**: .json, .xml
40 -
41 -**Mandatory Fields for Integration**
42 -
43 -|=**Field Name**|=**Description**
44 -|**Subject ID**|Unique patient identifier
45 -|**Diagnosis**|Standardized disease classification
46 -|**Biomarkers**|CSF, plasma, or imaging biomarkers
47 -|**Genetic Data**|Whole-genome or exome sequencing
48 -|**Neuroimaging Metadata**|MRI/PET acquisition parameters
49 -
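A record can be checked against the mandatory fields in the table above before upload. The sketch below assumes each record is a Python dictionary keyed by the field names exactly as they appear in the table; the helper name `missing_fields` and the sample record are hypothetical.

```python
# Field names copied from the "Mandatory Fields for Integration" table.
MANDATORY_FIELDS = (
    "Subject ID",
    "Diagnosis",
    "Biomarkers",
    "Genetic Data",
    "Neuroimaging Metadata",
)


def missing_fields(record):
    """Return the mandatory fields that are absent or empty in a record."""
    return [f for f in MANDATORY_FIELDS if record.get(f) in (None, "", [], {})]


# Hypothetical record with two gaps.
example = {
    "Subject ID": "S1",
    "Diagnosis": "Alzheimer's disease",
    "Biomarkers": {"CSF tau": 310.0},
    "Genetic Data": "",
}
gaps = missing_fields(example)
```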
50 -=== **Upload Data to Neurodiagnoses** ===
51 -
52 -**Option 1:** Upload to **EBRAINS Bucket** → [[Neurodiagnoses Data Storage>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
53 -**Option 2:** Contribute via **GitHub Repository** → [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
54 -
55 -**For large datasets, please contact project administrators before uploading.**
56 -
57 -=== **Integrate Data into AI Models** ===
58 -
59 -Use **Jupyter Notebooks** on EBRAINS for **data preprocessing.**
60 -Standardize data using **harmonization tools.**
61 -Train AI models with **newly integrated datasets.**
62 -
63 -**Reference:** [[Data Processing Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]]
64 -
65 65  ----
66 66  
67 -== **AI-Powered Annotation & Machine Learning Models** ==
28 +=== **2. AI-Based Analysis** ===
68 68  
69 -Neurodiagnoses applies **advanced machine learning models** to classify CNS diseases, extract features from **biomarkers and neuroimaging**, and provide **AI-powered annotation.**
30 +==== **Model Development** ====
70 70  
71 -=== **AI Model Categories** ===
32 +* **Embedding Models**: Use pre-trained models like BioBERT or BioLORD for text data.
33 +* **Classification Models**:
34 +** Algorithms: Random Forest, Support Vector Machines (SVM), or neural networks.
35 +** Purpose: Predict the likelihood of specific neurological conditions based on input data.
72 72  
73 -|=**Model Type**|=**Function**|=**Example Algorithms**
74 -|**Probabilistic Diagnosis**|Assigns probability scores to multiple CNS disorders.|Random Forest, XGBoost, Bayesian Networks
75 -|**Tridimensional Diagnosis**|Classifies disorders based on Etiology, Biomarkers, and Neuroanatomical Correlations.|CNNs, Transformers, Autoencoders
76 -|**Biomarker Prediction**|Predicts missing biomarker values using regression.|KNN Imputation, Bayesian Estimation
77 -|**Neuroimaging Feature Extraction**|Extracts patterns from MRI, PET, and EEG.|CNNs, Graph Neural Networks
78 -|**Clinical Decision Support**|Generates AI-driven diagnostic reports.|SHAP Explainability Tools
37 +==== **Dimensionality Reduction and Interpretability** ====
79 79  
80 -**Reference:** [[AI Model Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/models.md]]
39 +* Leverage [[DEIBO>>https://drive.ebrains.eu/f/8d7157708cde4b258db0/]] (Data-driven Embedding Interpretation Based on Ontologies) to connect model dimensions to ontology concepts.
40 +* Evaluate interpretability using metrics like the Area Under the Interpretability Curve (AUIC).
81 81  
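The **KNN Imputation** entry in the Biomarker Prediction row of the model table can be illustrated with a small standard-library sketch. It is not the project's implementation: the function name `knn_impute`, the biomarker names, and the values are hypothetical, and the predictors are assumed to be complete and on comparable scales.

```python
import math


def knn_impute(rows, target, k=2):
    """Fill None values of `target` with the mean over the k nearest complete rows."""
    complete = [r for r in rows if r[target] is not None]
    if not complete:
        raise ValueError("no complete rows to learn from")
    predictors = [name for name in rows[0] if name != target]
    filled = []
    for row in rows:
        if row[target] is not None:
            filled.append(dict(row))
            continue
        point = [row[p] for p in predictors]
        # Rank complete rows by Euclidean distance in predictor space.
        nearest = sorted(
            complete,
            key=lambda c: math.dist(point, [c[p] for p in predictors]),
        )[:k]
        estimate = sum(n[target] for n in nearest) / len(nearest)
        filled.append({**row, target: estimate})
    return filled


# Hypothetical, already-scaled biomarker values.
rows = [
    {"abeta": 1.0, "nfl": 1.0, "tau": 100.0},
    {"abeta": 1.2, "nfl": 0.9, "tau": 120.0},
    {"abeta": 5.0, "nfl": 5.0, "tau": 400.0},
    {"abeta": 1.1, "nfl": 1.0, "tau": None},
]
filled = knn_impute(rows, target="tau", k=2)
```

The last row takes the mean of its two closest neighbours (the first two rows), so the distant third row does not influence the estimate.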
82 82  ----
83 83  
84 -== **Clinical Decision Support & Tridimensional Diagnostic Framework** ==
44 +=== **3. Diagnostic Framework** ===
85 85  
86 -Neurodiagnoses generates **structured AI reports** for clinicians, combining:
46 +==== **Axes of Diagnosis** ====
87 87  
88 -**Probabilistic Diagnosis:** AI-generated ranking of potential diagnoses.
89 -**Tridimensional Classification:** Standardized diagnostic reports based on:
48 +The framework organizes diagnostic data into three axes:
90 90  
91 -1. **Axis 1:** **Etiology** → Genetic, Autoimmune, Prion, Toxic, Vascular.
92 -1. **Axis 2:** **Molecular Markers** → CSF, Neuroinflammation, EEG biomarkers.
93 -1. **Axis 3:** **Neuroanatomoclinical Correlations** → MRI atrophy, PET.
50 +1. **Etiology**: Genetic and environmental risk factors.
51 +1. **Molecular Markers**: Biomarkers such as amyloid-beta, tau, and alpha-synuclein.
52 +1. **Neuroanatomical Correlations**: Results from neuroimaging (e.g., MRI, PET).
94 94  
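Both versions describe the same three axes, which map naturally onto a small record type. The class below is a sketch under assumptions: the class name, attribute names, and example values are hypothetical and do not come from the Neurodiagnoses code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TridimensionalDiagnosis:
    # Axis 1: etiology, e.g. genetic or environmental risk factors.
    etiology: List[str] = field(default_factory=list)
    # Axis 2: molecular markers, stored as name -> measured value.
    molecular_markers: Dict[str, float] = field(default_factory=dict)
    # Axis 3: neuroanatomical correlations, e.g. imaging findings.
    neuroanatomical: List[str] = field(default_factory=list)

    def missing_axes(self):
        """Return the names of axes that carry no information yet."""
        return [name for name, value in vars(self).items() if not value]


diagnosis = TridimensionalDiagnosis(
    etiology=["genetic"],
    molecular_markers={"tau": 310.0},
)
incomplete = diagnosis.missing_axes()
```

A structure like this makes gaps explicit: any axis returned by `missing_axes()` has no supporting data yet.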
95 -**Reference:** [[Tridimensional Classification Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/classification.md]]
54 +==== **Recommendation System** ====
96 96  
56 +* Suggests additional tests or biomarkers if gaps are detected in the data.
57 +* Prioritizes tests based on clinical impact and cost-effectiveness.
58 +
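The two recommendation rules above (detect gaps, then prioritize by clinical impact and cost-effectiveness) can be sketched as a simple ranking. Everything specific here is an assumption for illustration: the test catalogue, the impact scores, the costs, and the use of an impact-to-cost ratio as the priority.

```python
# Hypothetical catalogue; impact scores and costs are made-up placeholders.
CANDIDATE_TESTS = {
    "CSF tau": {"impact": 0.8, "cost": 400.0},
    "Amyloid PET": {"impact": 0.9, "cost": 3000.0},
    "Plasma NfL": {"impact": 0.5, "cost": 100.0},
}


def recommend(available, catalogue=CANDIDATE_TESTS):
    """Rank the tests not yet available by impact per unit cost, best first."""
    gaps = [name for name in catalogue if name not in available]
    return sorted(
        gaps,
        key=lambda name: catalogue[name]["impact"] / catalogue[name]["cost"],
        reverse=True,
    )


suggested = recommend(available={"CSF tau"})
```

With these placeholder numbers the inexpensive plasma test ranks ahead of the PET scan, because its impact per unit cost is higher.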
97 97  ----
98 98  
99 -== **Data Security, Compliance & Federated Learning** ==
61 +=== **4. Computational Workflow** ===
100 100  
101 -✔ **Privacy-Preserving AI**: Implements **Federated Learning**, ensuring that patient data **never leaves** local institutions.
102 -✔ **Secure Data Access**: Data remains **stored on EBRAINS MIP servers** and is protected with **differential privacy techniques.**
103 -✔ **Ethical & GDPR Compliance**: Data-sharing agreements **must be signed** before use.
63 +1. **Data Loading**: Import data from storage (Drive or Bucket).
64 +1. **Feature Engineering**: Generate derived features from the raw data.
65 +1. **Model Training**:
66 +1*. Split data into training, validation, and test sets.
67 +1*. Train models with cross-validation to ensure robustness.
68 +1. **Evaluation**:
69 +1*. Metrics: Accuracy, F1-Score, AUIC for interpretability.
70 +1*. Compare against baseline models and domain benchmarks.
104 104  
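The training and evaluation steps of the workflow can be sketched without external libraries. The split ratios, the fold count, and all helper names below are assumptions chosen for illustration; AUIC is not implemented here because the page does not define how it is computed.

```python
import random


def split(items, train=0.7, val=0.15, seed=0):
    """Shuffle and split items into training, validation, and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def k_fold(items, k=5):
    """Yield (training, held_out) pairs for k-fold cross-validation."""
    for i in range(k):
        held_out = items[i::k]
        training = [x for j, x in enumerate(items) if j % k != i]
        yield training, held_out


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


train_set, val_set, test_set = split(list(range(20)))
folds = list(k_fold(train_set, k=5))
```

The test set is held back entirely; cross-validation runs only on the training portion, so the final metrics are computed on data the model never saw.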
105 -**Reference:** [[Data Protection & Federated Learning>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/security.md]]
106 -
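The federated learning point above, where patient data stays at each institution, is commonly realized by sharing only model parameters. The sketch below shows the aggregation step of federated averaging (FedAvg) as one possible approach; the page does not state which algorithm Neurodiagnoses uses, and the function name, weights, and sample counts are hypothetical.

```python
def federated_average(site_updates):
    """Average model weights across sites, weighted by local sample count.

    site_updates: list of (weights, n_samples) pairs, one per institution.
    Only these parameters are shared; raw patient records stay local.
    """
    total = sum(n for _, n in site_updates)
    dim = len(site_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in site_updates) / total
        for i in range(dim)
    ]


# Hypothetical updates from two institutions.
global_weights = federated_average([
    ([1.0, 2.0], 100),
    ([3.0, 4.0], 300),
])
```

Sites with more samples pull the global model further toward their local weights, which is why the second site dominates in this example.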
107 107  ----
108 108  
109 -== **Data Processing & Integration with Clinica.Run** ==
74 +=== **5. Validation** ===
110 110  
111 -Neurodiagnoses now supports **Clinica.Run**, an **open-source neuroimaging platform** for **multimodal data processing.**
76 +==== **Internal Validation** ====
112 112  
113 -=== **How It Works** ===
78 +* Test the system using simulated datasets and known clinical cases.
79 +* Fine-tune models based on validation results.
114 114  
115 -✔ **Neuroimaging Preprocessing**: MRI, PET, and EEG data are preprocessed using **Clinica.Run pipelines.**
116 -✔ **Automated Biomarker Extraction**: Extracts volumetric, metabolic, and functional biomarkers.
117 -✔ **Data Security & Compliance**: Clinica.Run is **GDPR & HIPAA-compliant.**
81 +==== **External Validation** ====
118 118  
119 -=== **Implementation Steps** ===
83 +* Collaborate with research institutions and hospitals to test the system in real-world settings.
84 +* Use anonymized patient data to ensure privacy compliance.
120 120  
121 -1. Install **Clinica.Run** dependencies.
122 -1. Configure **Clinica.Run pipeline** in clinica_run_config.json.
123 -1. Run **biomarker extraction pipelines** for AI-based diagnostics.
124 -
125 -**Reference:** [[Clinica.Run Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/clinica_run.md]]
126 -
127 127  ----
128 128  
129 -== **Collaborative Development & Research** ==
88 +=== **6. Collaborative Development** ===
130 130  
131 -**We Use GitHub to Develop AI Models & Store Research Data**
90 +The project is open to contributions from researchers, clinicians, and developers. Key tools include:
132 132  
133 -* **GitHub Repository:** AI model training scripts.
134 -* **GitHub Issues:** Tracks ongoing research questions.
135 -* **GitHub Wiki:** Project documentation & user guides.
92 +* **Jupyter Notebooks**: For data analysis and pipeline development.
93 +** Example: [[probabilistic imputation>>https://drive.ebrains.eu/f/4f69ab52f7734ef48217/]]
94 +* **Wiki Pages**: For documenting methods and results.
95 +* **Drive and Bucket**: For sharing code, data, and outputs.
96 +* **Collaboration with related projects:** For instance: [[//Beyond the hype: AI in dementia – from early risk detection to disease treatment//>>https://www.lethe-project.eu/beyond-the-hype-ai-in-dementia-from-early-risk-detection-to-disease-treatment/]]
136 136  
137 -**We Use EBRAINS for Data & Collaboration**
138 -
139 -* **EBRAINS Buckets:** Large-scale neuroimaging and biomarker storage.
140 -* **EBRAINS Jupyter Notebooks:** Cloud-based AI model execution.
141 -* **EBRAINS Wiki:** Research documentation and updates.
142 -
143 -**Join the Project Forum:** [[GitHub Discussions>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]]
144 -
145 145  ----
146 146  
147 -**For Additional Documentation:**
100 +=== **7. Tools and Technologies** ===
148 148  
149 -* **GitHub Repository:** [[Neurodiagnoses AI Models>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
150 -* **EBRAINS Wiki:** [[Neurodiagnoses Research Collaboration>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]
151 -
152 -----
153 -
154 -**Neurodiagnoses is Open for Contributions – Join Us Today!**
102 +* **Programming Languages**: Python for AI and data processing.
103 +* **Frameworks**:
104 +** TensorFlow and PyTorch for machine learning.
105 +** Flask or FastAPI for backend services.
106 +* **Visualization**: Plotly and Matplotlib for interactive and static visualizations.
107 +* **EBRAINS Services**:
108 +** Collaboratory Lab for running Notebooks.
109 +** Buckets for storing large datasets.