Changes for page Methodology
Last modified by manuelmenendez on 2025/03/14 08:31
From version 17.1
edited by manuelmenendez
on 2025/02/09 13:01
Change comment:
There is no comment for this version
To version 4.3
edited by manuelmenendez
on 2025/01/29 19:11
Change comment:
There is no comment for this version
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -1,154 +1,109 @@
-== **Overview** ==
+=== **Overview** ===

-Neurodiagnoses develops a **tridimensional diagnostic framework** for **CNS diseases**, incorporating **AI-powered annotation tools** to improve **interpretability, standardization, and clinical utility.**
+This section describes the step-by-step process used in the **Neurodiagnoses** project to develop a novel diagnostic framework for neurological diseases. The methodology integrates artificial intelligence (AI), biomedical ontologies, and computational neuroscience to create a structured, interpretable, and scalable diagnostic system.

-This methodology integrates **multi-modal data**, including:
-**Genetic data** (whole-genome sequencing, polygenic risk scores).
-**Neuroimaging** (MRI, PET, EEG, MEG).
-**Neurophysiological data** (EEG-based biomarkers, sleep actigraphy).
-**CSF & Blood Biomarkers** (Amyloid-beta, Tau, Neurofilament Light).
-
-By applying **machine learning models**, Neurodiagnoses generates **structured, explainable diagnostic outputs** to assist **clinical decision-making** and **biomarker-driven patient stratification.**
-
 ----

-== **Data Integration & External Databases** ==
+=== **1. Data Integration** ===

-=== **How to Use External Databases in Neurodiagnoses** ===
+==== **Data Sources** ====

-Neurodiagnoses integrates data from multiple **biomedical and neurological research databases**. Researchers can follow these steps to **access, prepare, and integrate** data into the Neurodiagnoses framework.
+* **Biomedical Ontologies**:
+** Human Phenotype Ontology (HPO) for phenotypic abnormalities.
+** Gene Ontology (GO) for molecular and cellular processes.
+* **Neuroimaging Datasets**:
+** Example: Alzheimer’s Disease Neuroimaging Initiative (ADNI), OpenNeuro.
+* **Clinical and Biomarker Data**:
+** Anonymized clinical reports, molecular biomarkers, and test results.

-**Potential Data Sources**
-**Reference:** [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]

-=== **Register for Access** ===
+==== **Data Preprocessing** ====

-Each **external database** requires **individual registration** and approval.
-✔️ Follow the official **data access guidelines** of each provider.
-✔️ Ensure compliance with **ethical approvals** and **data-sharing agreements (DUAs).**
+1. **Standardization**: Ensure all data sources are normalized to a common format.
+1. **Feature Selection**: Identify relevant features for diagnosis (e.g., biomarkers, imaging scores).
+1. **Data Cleaning**: Handle missing values and remove duplicates.

-=== **Download & Prepare Data** ===
-
-Once access is granted, download datasets **following compliance guidelines** and **format requirements** for integration.
-
-**Supported File Formats**
-
-* **Tabular Data**: .csv, .tsv
-* **Neuroimaging Data**: .nii, .dcm
-* **Genomic Data**: .fasta, .vcf
-* **Clinical Metadata**: .json, .xml
-
-**Mandatory Fields for Integration**
-
-|=**Field Name**|=**Description**
-|**Subject ID**|Unique patient identifier
-|**Diagnosis**|Standardized disease classification
-|**Biomarkers**|CSF, plasma, or imaging biomarkers
-|**Genetic Data**|Whole-genome or exome sequencing
-|**Neuroimaging Metadata**|MRI/PET acquisition parameters
-
-=== **Upload Data to Neurodiagnoses** ===
-
-**Option 1:** Upload to **EBRAINS Bucket** → [[Neurodiagnoses Data Storage>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
-**Option 2:** Contribute via **GitHub Repository** → [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
-
-**For large datasets, please contact project administrators before uploading.**
-
-=== **Integrate Data into AI Models** ===
-
-Use **Jupyter Notebooks** on EBRAINS for **data preprocessing.**
-Standardize data using **harmonization tools.**
-Train AI models with **newly integrated datasets.**
-
-**Reference:** [[Data Processing Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]]
-
 ----

-== **AI-Powered Annotation & Machine Learning Models** ==
+=== **2. AI-Based Analysis** ===

-Neurodiagnoses applies **advanced machine learning models** to classify CNS diseases, extract features from **biomarkers and neuroimaging**, and provide **AI-powered annotation.**
+==== **Model Development** ====

-=== **AI Model Categories** ===
+* **Embedding Models**: Use pre-trained models like BioBERT or BioLORD for text data.
+* **Classification Models**:
+** Algorithms: Random Forest, Support Vector Machines (SVM), or neural networks.
+** Purpose: Predict the likelihood of specific neurological conditions based on input data.

-|=**Model Type**|=**Function**|=**Example Algorithms**
-|**Probabilistic Diagnosis**|Assigns probability scores to multiple CNS disorders.|Random Forest, XGBoost, Bayesian Networks
-|**Tridimensional Diagnosis**|Classifies disorders based on Etiology, Biomarkers, and Neuroanatomical Correlations.|CNNs, Transformers, Autoencoders
-|**Biomarker Prediction**|Predicts missing biomarker values using regression.|KNN Imputation, Bayesian Estimation
-|**Neuroimaging Feature Extraction**|Extracts patterns from MRI, PET, EEG.|CNNs, Graph Neural Networks
-|**Clinical Decision Support**|Generates AI-driven diagnostic reports.|SHAP Explainability Tools
+==== **Dimensionality Reduction and Interpretability** ====

-**Reference:** [[AI Model Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/models.md]]
+* Leverage [[DEIBO>>https://drive.ebrains.eu/f/8d7157708cde4b258db0/]] (Data-driven Embedding Interpretation Based on Ontologies) to connect model dimensions to ontology concepts.
+* Evaluate interpretability using metrics like the Area Under the Interpretability Curve (AUIC).

 ----

-== **Clinical Decision Support & Tridimensional Diagnostic Framework** ==
+=== **3. Diagnostic Framework** ===

-Neurodiagnoses generates **structured AI reports** for clinicians, combining:
+==== **Axes of Diagnosis** ====

-**Probabilistic Diagnosis:** AI-generated ranking of potential diagnoses.
-**Tridimensional Classification:** Standardized diagnostic reports based on:
+The framework organizes diagnostic data into three axes:

-1. **Axis 1:** **Etiology** → Genetic, Autoimmune, Prion, Toxic, Vascular.
-1. **Axis 2:** **Molecular Markers** → CSF, Neuroinflammation, EEG biomarkers.
-1. **Axis 3:** **Neuroanatomoclinical Correlations** → MRI atrophy, PET.
+1. **Etiology**: Genetic and environmental risk factors.
+1. **Molecular Markers**: Biomarkers such as amyloid-beta, tau, and alpha-synuclein.
+1. **Neuroanatomical Correlations**: Results from neuroimaging (e.g., MRI, PET).

-**Reference:** [[Tridimensional Classification Guide>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/classification.md]]
+==== **Recommendation System** ====

+* Suggests additional tests or biomarkers if gaps are detected in the data.
+* Prioritizes tests based on clinical impact and cost-effectiveness.
+
 ----

-== **Data Security, Compliance & Federated Learning** ==
+=== **4. Computational Workflow** ===

-✔ **Privacy-Preserving AI**: Implements **Federated Learning**, ensuring that patient data **never leaves** local institutions.
-✔ **Secure Data Access**: Data remains **stored in EBRAINS MIP servers** using **differential privacy techniques.**
-✔ **Ethical & GDPR Compliance**: Data-sharing agreements **must be signed** before use.
+1. **Data Loading**: Import data from storage (Drive or Bucket).
+1. **Feature Engineering**: Generate derived features from the raw data.
+1. **Model Training**:
+1*. Split data into training, validation, and test sets.
+1*. Train models with cross-validation to ensure robustness.
+1. **Evaluation**:
+1*. Metrics: Accuracy, F1-Score, AUIC for interpretability.
+1*. Compare against baseline models and domain benchmarks.

-**Reference:** [[Data Protection & Federated Learning>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/security.md]]
-
 ----

-== **Data Processing & Integration with Clinica.Run** ==
+=== **5. Validation** ===

-Neurodiagnoses now supports **Clinica.Run**, an **open-source neuroimaging platform** for **multimodal data processing.**
+==== **Internal Validation** ====

-=== **How It Works** ===
+* Test the system using simulated datasets and known clinical cases.
+* Fine-tune models based on validation results.

-✔ **Neuroimaging Preprocessing**: MRI, PET, EEG data is preprocessed using **Clinica.Run pipelines.**
-✔ **Automated Biomarker Extraction**: Extracts volumetric, metabolic, and functional biomarkers.
-✔ **Data Security & Compliance**: Clinica.Run is **GDPR & HIPAA-compliant.**
+==== **External Validation** ====

-=== **Implementation Steps** ===
+* Collaborate with research institutions and hospitals to test the system in real-world settings.
+* Use anonymized patient data to ensure privacy compliance.

-1. Install **Clinica.Run** dependencies.
-1. Configure **Clinica.Run pipeline** in clinica_run_config.json.
-1. Run **biomarker extraction pipelines** for AI-based diagnostics.
-
-**Reference:** [[Clinica.Run Documentation>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/clinica_run.md]]
-
 ----

-== **Collaborative Development & Research** ==
+=== **6. Collaborative Development** ===

-**We Use GitHub to Develop AI Models & Store Research Data**
+The project is open to contributions from researchers, clinicians, and developers. Key tools include:

-* **GitHub Repository:** AI model training scripts.
-* **GitHub Issues:** Tracks ongoing research questions.
-* **GitHub Wiki:** Project documentation & user guides.
+* **Jupyter Notebooks**: For data analysis and pipeline development.
+** Example: [[probabilistic imputation>>https://drive.ebrains.eu/f/4f69ab52f7734ef48217/]]
+* **Wiki Pages**: For documenting methods and results.
+* **Drive and Bucket**: For sharing code, data, and outputs.
+* **Collaboration with related projects: **For instance: [[//Beyond the hype: AI in dementia – from early risk detection to disease treatment//>>https://www.lethe-project.eu/beyond-the-hype-ai-in-dementia-from-early-risk-detection-to-disease-treatment/]]

-**We Use EBRAINS for Data & Collaboration**
-
-* **EBRAINS Buckets:** Large-scale neuroimaging and biomarker storage.
-* **EBRAINS Jupyter Notebooks:** Cloud-based AI model execution.
-* **EBRAINS Wiki:** Research documentation and updates.
-
-**Join the Project Forum:** [[GitHub Discussions>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/discussions]]
-
 ----

-**For Additional Documentation:**
+=== **7. Tools and Technologies** ===

-* **GitHub Repository:** [[Neurodiagnoses AI Models>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
-* **EBRAINS Wiki:** [[Neurodiagnoses Research Collaboration>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]
-
-----
-
-**Neurodiagnoses is Open for Contributions – Join Us Today!**
+* **Programming Languages**: Python for AI and data processing.
+* **Frameworks**:
+** TensorFlow and PyTorch for machine learning.
+** Flask or FastAPI for backend services.
+* **Visualization**: Plotly and Matplotlib for interactive and static visualizations.
+* **EBRAINS Services**:
+** Collaboratory Lab for running Notebooks.
+** Buckets for storing large datasets.
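
Two of the steps named in the added lines, "Data Cleaning: Handle missing values and remove duplicates" and "Split data into training, validation, and test sets", could look roughly like the following. This is only an illustrative sketch using the Python standard library. The record fields (`subject_id`, `tau`), the median-fill strategy, the 70/15/15 split, and the function names are assumptions made for the example, not part of the Neurodiagnoses pipeline.

```python
import random
from statistics import median


def clean_records(records, numeric_field):
    """Drop duplicate subjects and fill missing numeric values with the median."""
    seen = set()
    unique = []
    for rec in records:
        if rec["subject_id"] in seen:  # keep only the first record per subject
            continue
        seen.add(rec["subject_id"])
        unique.append(dict(rec))  # copy so the caller's records are not mutated

    observed = [r[numeric_field] for r in unique if r[numeric_field] is not None]
    fill = median(observed) if observed else None
    for rec in unique:
        if rec[numeric_field] is None:
            rec[numeric_field] = fill  # assumed strategy: median imputation
    return unique


def split_records(records, train_pct=70, val_pct=15, seed=0):
    """Shuffle reproducibly and split into training, validation, and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains is the test set
    )


# Hypothetical toy data: one duplicated subject and one missing value.
raw = [
    {"subject_id": "s1", "tau": 310.0},
    {"subject_id": "s2", "tau": None},
    {"subject_id": "s1", "tau": 310.0},
    {"subject_id": "s3", "tau": 450.0},
]
cleaned = clean_records(raw, "tau")

# The toy rows are repeated only so that there is enough data to split.
train_set, val_set, test_set = split_records(cleaned * 10)
```

Integer percentages are used for the split sizes to avoid floating-point rounding surprises. In a real project the split would normally be done per subject so that no patient appears in more than one set.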