Wiki source code of Methodology

Version 4.2 by manuelmenendez on 2025/01/29 19:10

=== **Overview** ===

This section describes the step-by-step process used in the **Neurodiagnoses** project to develop a novel diagnostic framework for neurological diseases. The methodology integrates artificial intelligence (AI), biomedical ontologies, and computational neuroscience to create a structured, interpretable, and scalable diagnostic system.

----

=== **1. Data Integration** ===

==== **Data Sources** ====

* **Biomedical Ontologies**:
** Human Phenotype Ontology (HPO) for phenotypic abnormalities.
** Gene Ontology (GO) for molecular and cellular processes.
* **Neuroimaging Datasets**:
** Examples: Alzheimer’s Disease Neuroimaging Initiative (ADNI), OpenNeuro.
* **Clinical and Biomarker Data**:
** Anonymized clinical reports, molecular biomarkers, and test results.

==== **Data Preprocessing** ====

1. **Standardization**: Normalize all data sources to a common format.
1. **Feature Selection**: Identify relevant features for diagnosis (e.g., biomarkers, imaging scores).
1. **Data Cleaning**: Handle missing values and remove duplicates.

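The three preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the project's pipeline: the records, the field names (`patient_id`, `mmse`, `csf_tau_pg_ml`), and the choice of median imputation for missing values are all assumptions made for the example.

```python
from statistics import median

# Hypothetical raw records: field names and values are invented for illustration.
RAW_RECORDS = [
    {"Patient_ID": "p1", "MMSE": "28", "CSF_Tau_pg_ml": "310"},
    {"patient_id": "p2", "mmse": "21", "csf_tau_pg_ml": None},
    {"patient_id": "p1", "mmse": "28", "csf_tau_pg_ml": "310"},  # duplicate of p1
    {"patient_id": "p3", "mmse": None, "csf_tau_pg_ml": "520"},
]

SELECTED_FEATURES = ["mmse", "csf_tau_pg_ml"]  # assumed diagnostic features


def standardize(record):
    """Lower-case field names and convert numeric strings to floats."""
    out = {}
    for key, value in record.items():
        key = key.lower()
        if key == "patient_id" or value is None:
            out[key] = value
        else:
            out[key] = float(value)
    return out


def select_features(record, features):
    """Keep the identifier plus the chosen features."""
    selected = {"patient_id": record["patient_id"]}
    for f in features:
        selected[f] = record.get(f)
    return selected


def drop_duplicates(records):
    """Drop records that exactly repeat an earlier record."""
    seen, unique = set(), []
    for r in records:
        fingerprint = tuple(sorted(r.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(r)
    return unique


def impute_median(records, features):
    """Fill missing values with the per-feature median (an assumed strategy)."""
    for f in features:
        fill = median(r[f] for r in records if r[f] is not None)
        for r in records:
            if r[f] is None:
                r[f] = fill
    return records


def preprocess(raw, features=SELECTED_FEATURES):
    records = [select_features(standardize(r), features) for r in raw]
    return impute_median(drop_duplicates(records), features)


clean = preprocess(RAW_RECORDS)
```

Duplicates are removed before imputation so that a repeated record does not skew the median.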
----

=== **2. AI-Based Analysis** ===

==== **Model Development** ====

* **Embedding Models**: Use pre-trained models such as BioBERT or BioLORD for text data.
* **Classification Models**:
** Algorithms: Random Forest, Support Vector Machines (SVM), or neural networks.
** Purpose: Predict the likelihood of specific neurological conditions based on input data.

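To make "predict the likelihood of a condition" concrete, the sketch below trains a single-neuron logistic model by gradient descent. This is a deliberately simplified stand-in, not a Random Forest, an SVM, or the project's actual model; in practice a library such as PyTorch or TensorFlow would be used. The two input features and the toy labels are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Return the modelled probability that the condition is present."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data: two standardized features (hypothetical); label 1 = condition present.
X = [[-1.2, -0.8], [-0.9, -1.1], [-0.4, -0.6], [0.5, 0.7], [1.0, 0.9], [1.3, 1.2]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
p_high = predict_proba(w, b, [1.1, 1.0])    # resembles the positive cases
p_low = predict_proba(w, b, [-1.0, -1.0])   # resembles the negative cases
```

The output is a probability rather than a hard label, which matches the stated purpose of estimating likelihood.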
==== **Dimensionality Reduction and Interpretability** ====

* Leverage [[DEIBO>>https://drive.ebrains.eu/f/8d7157708cde4b258db0/]] (Data-driven Embedding Interpretation Based on Ontologies) to connect model dimensions to ontology concepts.
* Evaluate interpretability using metrics such as the Area Under the Interpretability Curve (AUIC).

----

=== **3. Diagnostic Framework** ===

==== **Axes of Diagnosis** ====

The framework organizes diagnostic data into three axes:

1. **Etiology**: Genetic and environmental risk factors.
1. **Molecular Markers**: Biomarkers such as amyloid-beta, tau, and alpha-synuclein.
1. **Neuroanatomical Correlations**: Results from neuroimaging (e.g., MRI, PET).

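One possible way to represent the three axes in code is a small container with one field per axis. The class name `DiagnosticProfile`, its field names, and the example values are hypothetical and serve only to illustrate the structure.

```python
from dataclasses import dataclass, field


@dataclass
class DiagnosticProfile:
    """Three-axis record for one patient (illustrative structure only)."""
    patient_id: str
    etiology: dict = field(default_factory=dict)            # axis 1: risk factors
    molecular_markers: dict = field(default_factory=dict)   # axis 2: biomarkers
    neuroanatomy: dict = field(default_factory=dict)        # axis 3: imaging findings

    def missing_axes(self):
        """Return the names of axes that hold no data yet."""
        axes = {
            "etiology": self.etiology,
            "molecular_markers": self.molecular_markers,
            "neuroanatomy": self.neuroanatomy,
        }
        return [name for name, data in axes.items() if not data]


# Example values are invented, not clinical data.
profile = DiagnosticProfile(
    patient_id="p1",
    etiology={"apoe_e4_carrier": True},
    molecular_markers={"amyloid_beta_positive": True, "tau_positive": False},
)
missing = profile.missing_axes()
```

Keeping the axes as separate fields makes it easy to see which part of a diagnosis is still unsupported by data.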
==== **Recommendation System** ====

* Suggests additional tests or biomarkers if gaps are detected in the data.
* Prioritizes tests based on clinical impact and cost-effectiveness.

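The two behaviours above, gap detection and prioritization, can be sketched as follows. The test catalog, the impact scores, the costs, and the ranking rule (clinical impact divided by cost) are assumptions chosen for illustration; the source does not specify how impact and cost are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTest:
    name: str
    fills: str               # the data gap this test would close
    clinical_impact: float   # 0..1, assumed expert-assigned score
    cost: float              # arbitrary currency units


# Hypothetical catalog; scores and costs are placeholders, not real figures.
CATALOG = [
    CandidateTest("CSF tau assay", "tau", 0.8, 400.0),
    CandidateTest("Amyloid PET", "amyloid_beta", 0.9, 2500.0),
    CandidateTest("CSF amyloid-beta assay", "amyloid_beta", 0.7, 400.0),
    CandidateTest("Structural MRI", "mri", 0.6, 500.0),
]


def recommend(available, required, catalog=CATALOG):
    """List tests that close a data gap, best impact-per-cost first."""
    gaps = set(required) - set(available)
    candidates = [t for t in catalog if t.fills in gaps]
    return sorted(candidates, key=lambda t: t.clinical_impact / t.cost, reverse=True)


recs = recommend(available={"mri"}, required={"mri", "tau", "amyloid_beta"})
rec_names = [t.name for t in recs]
```

Because MRI data is already available in this example, the MRI entry is never suggested; only tests that close an actual gap are ranked.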
----

=== **4. Computational Workflow** ===

1. **Data Loading**: Import data from storage (Drive or Bucket).
1. **Feature Engineering**: Generate derived features from the raw data.
1. **Model Training**:
1*. Split data into training, validation, and test sets.
1*. Train models with cross-validation to assess robustness.
1. **Evaluation**:
1*. Metrics: accuracy and F1-score for predictive performance, AUIC for interpretability.
1*. Compare against baseline models and domain benchmarks.

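The splitting, cross-validation, and metric steps of the workflow can be sketched with the standard library alone. The split fractions, the seed, and the toy labels are assumptions; AUIC is not implemented here because the source does not define it.

```python
import random


def train_val_test_split(items, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once, then cut into train, validation, and test partitions."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


train, val, test = train_val_test_split(range(20))
folds = list(k_fold_indices(len(train), k=5))

# Toy labels and predictions, invented for illustration.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Cross-validation is run on the training partition only, so the test set stays untouched until the final evaluation.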
----

=== **5. Validation** ===

==== **Internal Validation** ====

* Test the system using simulated datasets and known clinical cases.
* Fine-tune models based on validation results.

==== **External Validation** ====

* Collaborate with research institutions and hospitals to test the system in real-world settings.
* Use anonymized patient data to support privacy compliance.

----

=== **6. Collaborative Development** ===

The project is open to contributions from researchers, clinicians, and developers. Key tools include:

* **Jupyter Notebooks**: For data analysis and pipeline development.
** Example: [[probabilistic imputation>>https://drive.ebrains.eu/f/4f69ab52f7734ef48217/]]
* **Wiki Pages**: For documenting methods and results.
* **Drive and Bucket**: For sharing code, data, and outputs.
* **Related Projects**: For instance, [[//Beyond the hype: AI in dementia – from early risk detection to disease treatment//>>https://www.lethe-project.eu/beyond-the-hype-ai-in-dementia-from-early-risk-detection-to-disease-treatment/]]

----

=== **7. Tools and Technologies** ===

* **Programming Languages**: Python for AI and data processing.
* **Frameworks**:
** TensorFlow and PyTorch for machine learning.
** Flask or FastAPI for backend services.
* **Visualization**: Plotly for interactive and Matplotlib for static visualizations.
* **EBRAINS Services**:
** Collaboratory Lab for running Notebooks.
** Buckets for storing large datasets.