Wiki source code of Methodology

Version 19.1 by manuelmenendez on 2025/02/14 13:57

# **Neurodiagnoses AI: Multimodal AI for Neurodiagnostic Predictions**

## **Project Overview**
Neurodiagnoses AI implements AI-driven diagnostic and prognostic models for central nervous system (CNS) disorders, adapting the Florey Dementia Index (FDI) methodology to a broader set of neurological conditions. The approach integrates **multimodal data sources** (EEG, neuroimaging, biomarkers, and genetics) and employs **machine learning models** to provide **explainable, real-time diagnostic insights**.

## **How to Use External Databases in Neurodiagnoses**
To enhance diagnostic accuracy, Neurodiagnoses integrates data from multiple biomedical and neurological research databases. Researchers can follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.

### **Potential Data Sources**
Neurodiagnoses maintains an updated list of potential biomedical databases relevant to neurodegenerative diseases.

**Reference: List of Potential Databases**
- **ADNI**: Alzheimer's Disease data ([ADNI](https://adni.loni.usc.edu))
- **PPMI**: Parkinson’s Disease imaging and biospecimens ([PPMI](https://www.ppmi-info.org))
- **GP2**: Whole-genome sequencing for PD ([GP2](https://gp2.org))
- **Enroll-HD**: Huntington’s Disease clinical and genetic data ([Enroll-HD](https://www.enroll-hd.org))
- **GAAIN**: Multi-source Alzheimer’s data aggregation ([GAAIN](https://gaain.org))
- **UK Biobank**: Population-wide genetic, imaging, and health records ([UK Biobank](https://www.ukbiobank.ac.uk))
- **DPUK**: Dementia and Aging data ([DPUK](https://www.dementiasplatform.uk))
- **PRION Registry**: Prion Diseases clinical and genetic data ([PRION Registry](https://prionregistry.org))
- **DECIPHER**: Rare genetic disorder genomic variants ([DECIPHER](https://decipher.sanger.ac.uk))

### **1. Register for Access**
- Each external database requires **individual registration** and access approval.
- Ensure compliance with **ethical approvals** and **data usage agreements** before integrating datasets into Neurodiagnoses.
- Some repositories may require a **Data Usage Agreement (DUA)** for sensitive medical data.

### **2. Download & Prepare Data**
- Download datasets while adhering to database usage policies.
- Ensure files meet **Neurodiagnoses format requirements**:
  - **Tabular Data**: `.csv`, `.tsv`
  - **Neuroimaging Data**: `.nii`, `.dcm`
  - **Genomic Data**: `.fasta`, `.vcf`
  - **Clinical Metadata**: `.json`, `.xml`
- **Mandatory Fields for Integration**:
  - **Subject ID**: Unique patient identifier
  - **Diagnosis**: Standardized disease classification
  - **Biomarkers**: CSF, plasma, or imaging biomarkers
  - **Genetic Data**: Whole-genome or exome sequencing
  - **Neuroimaging Metadata**: MRI/PET acquisition parameters
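
The format and field requirements above could be checked before upload with a small script. The sketch below is illustrative only: the extension table is copied from the list above, while the snake_case column names and the helper names `data_category` and `missing_columns` are assumptions, not part of any published Neurodiagnoses schema.

```python
import csv
import io
from pathlib import PurePath

# Extensions taken from the format list above, grouped by data category.
SUPPORTED_FORMATS = {
    "tabular": {".csv", ".tsv"},
    "neuroimaging": {".nii", ".dcm"},
    "genomic": {".fasta", ".vcf"},
    "clinical_metadata": {".json", ".xml"},
}

# Hypothetical column names standing in for the mandatory fields above.
MANDATORY_COLUMNS = [
    "subject_id",
    "diagnosis",
    "biomarkers",
    "genetic_data",
    "neuroimaging_metadata",
]


def data_category(filename):
    """Return the data category for a file name, or None if unsupported."""
    suffix = PurePath(filename).suffix.lower()
    for category, extensions in SUPPORTED_FORMATS.items():
        if suffix in extensions:
            return category
    return None


def missing_columns(table_text, delimiter=","):
    """Return the mandatory columns absent from the header row of a table."""
    reader = csv.reader(io.StringIO(table_text), delimiter=delimiter)
    header = next(reader, [])
    present = {name.strip().lower() for name in header}
    return [name for name in MANDATORY_COLUMNS if name not in present]
```

Only the final extension is inspected, so a compressed file such as `scan.nii.gz` is reported as unsupported, because compressed variants are not in the list above.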

### **3. Upload Data to Neurodiagnoses**
**Option 1: Upload to EBRAINS Bucket**
- Location: [EBRAINS Neurodiagnoses Bucket](https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket)
- Ensure correct **metadata tagging** before submission.

**Option 2: Contribute via GitHub Repository**
- Location: [GitHub Data Repository](https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data)
- Create a new folder under `/data/` and include a **dataset description**.
- For large datasets, contact project administrators before uploading.
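
For the GitHub option, the folder layout could be prepared locally before opening a pull request. This is a minimal sketch under stated assumptions: the function name `create_dataset_folder` and the `README.md` file name for the dataset description are hypothetical, since the source only asks for a new folder under `/data/` with a description.

```python
from pathlib import Path


def create_dataset_folder(repo_root, dataset_name, description):
    """Create data/<dataset_name>/ under repo_root and write a description file."""
    folder = Path(repo_root) / "data" / dataset_name
    # exist_ok=False makes the call fail instead of reusing an existing dataset folder.
    folder.mkdir(parents=True, exist_ok=False)
    # Hypothetical file name for the dataset description.
    readme = folder / "README.md"
    readme.write_text(description.strip() + "\n", encoding="utf-8")
    return folder
```

Calling `create_dataset_folder(".", "example_cohort", "Short description of the cohort.")` from a local clone would create `data/example_cohort/README.md`; a second call with the same name raises `FileExistsError`.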

### **4. Integrate Data into AI Models**
- Open **Jupyter Notebooks** on EBRAINS to run **preprocessing scripts**.
- Standardize **neuroimaging and biomarker formats** using harmonization tools.
- Use **machine learning models** to handle missing data and extract features.
- Train AI models with **newly integrated patient cohorts**.

**Reference**: See `docs/data_processing.md` for detailed instructions.
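
To make the standardization and missing-data steps above concrete, here is a deliberately simple sketch for a single numeric biomarker column. It uses median imputation and z-score scaling as stand-ins; these are not the project's harmonization tools or machine learning models, which are described in `docs/data_processing.md`. The function names are hypothetical.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

Imputing before scaling keeps the column length unchanged, so each row still lines up with its subject identifier. A model-based imputer would replace `impute_median` without changing the rest of the flow.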

## **Collaboration & Partnerships**
### **Partnering with Data Providers**
Neurodiagnoses seeks partnerships with data repositories to:
- Enable **API-based data integration** for real-time processing.
- Co-develop **harmonized AI-ready datasets** with standardized annotations.
- Secure **funding opportunities** through joint grant applications.

**Interested in Partnering?**
- If you represent a research consortium or database provider, reach out to explore data-sharing agreements.
- **Contact**: info@neurodiagnoses.com

## **Final Notes**
Neurodiagnoses continuously expands its data ecosystem to support AI-driven clinical decision-making. Researchers and institutions are encouraged to contribute **new datasets and methodologies**.

For additional technical documentation:
- **GitHub Repository**: [Neurodiagnoses GitHub](https://github.com/neurodiagnoses)
- **EBRAINS Collaboration Page**: [EBRAINS Neurodiagnoses](https://ebrains.eu/collabs/neurodiagnoses)

If you experience issues integrating data, **open a GitHub Issue** or consult the **EBRAINS Neurodiagnoses Forum**.

== **How to Use External Databases in Neurodiagnoses** ==

To enhance the accuracy of our diagnostic models, Neurodiagnoses integrates data from multiple biomedical and neurological research databases. If you are a researcher, follow these steps to access, prepare, and integrate data into the Neurodiagnoses framework.

=== **Potential Data Sources** ===

Neurodiagnoses maintains an updated list of potential biomedical databases relevant to neurodegenerative diseases.

* Reference: [[List of Potential Databases>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/data/sources/list_of_potential_databases]]

=== **1. Register for Access** ===

Each external database requires individual registration and access approval. Follow the official guidelines of each database provider.

* Ensure that you have completed all ethical approvals and data access agreements before integrating datasets into Neurodiagnoses.
* Some repositories require a Data Usage Agreement (DUA) before downloading sensitive medical data.

=== **2. Download & Prepare Data** ===

Once access is granted, download datasets while complying with data usage policies. Ensure that the files meet Neurodiagnoses’ format requirements for smooth integration.

==== **Supported File Formats** ====

* Tabular Data: .csv, .tsv
* Neuroimaging Data: .nii, .dcm
* Genomic Data: .fasta, .vcf
* Clinical Metadata: .json, .xml

==== **Mandatory Fields for Integration** ====

|=Field Name|=Description
|Subject ID|Unique patient identifier
|Diagnosis|Standardized disease classification
|Biomarkers|CSF, plasma, or imaging biomarkers
|Genetic Data|Whole-genome or exome sequencing
|Neuroimaging Metadata|MRI/PET acquisition parameters

=== **3. Upload Data to Neurodiagnoses** ===

Once preprocessed, data can be uploaded to EBRAINS or GitHub.

* (((
**Option 1: Upload to EBRAINS Bucket**

* Location: [[EBRAINS Neurodiagnoses Bucket>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/Bucket]]
* Ensure correct metadata tagging before submission.
)))
* (((
**Option 2: Contribute via GitHub Repository**

* Location: [[GitHub Data Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/tree/main/data]]
* Create a new folder under /data/ and include a dataset description.
)))

//Note: For large datasets, please contact the project administrators before uploading.//

=== **4. Integrate Data into AI Models** ===

Once uploaded, datasets must be harmonized and formatted before AI model training.

==== **Steps for Data Integration** ====

* Open Jupyter Notebooks on EBRAINS to run preprocessing scripts.
* Standardize neuroimaging and biomarker formats using harmonization tools.
* Use machine learning models to handle missing data and extract features.
* Train AI models with newly integrated patient cohorts.
* Reference: detailed instructions can be found in [[docs/data_processing.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_processing.md]].

----

== **Database Sources Table** ==

=== **Where to Insert This** ===

* GitHub: [[docs/data_sources.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/data_sources.md]]
* EBRAINS Wiki: Collabs/neurodiagnoses/Data Sources

=== **Key Databases for Neurodiagnoses** ===

|=Database|=Focus Area|=Data Type|=Access Link
|ADNI|Alzheimer's Disease|MRI, PET, CSF, cognitive tests|[[ADNI>>url:https://adni.loni.usc.edu]]
|PPMI|Parkinson’s Disease|Imaging, biospecimens|[[PPMI>>url:https://www.ppmi-info.org/]]
|GP2|Genetic Data for PD|Whole-genome sequencing|[[GP2>>url:https://gp2.org/]]
|Enroll-HD|Huntington’s Disease|Clinical, genetic, imaging|[[Enroll-HD>>url:https://enroll-hd.org/]]
|GAAIN|Alzheimer's & Cognitive Decline|Multi-source data aggregation|[[GAAIN>>url:https://www.gaain.org/]]
|UK Biobank|Population-wide studies|Genetic, imaging, health records|[[UK Biobank>>url:https://www.ukbiobank.ac.uk/]]
|DPUK|Dementia & Aging|Imaging, genetics, lifestyle factors|[[DPUK>>url:https://www.dementiasplatform.uk/]]
|PRION Registry|Prion Diseases|Clinical and genetic data|[[PRION Registry>>url:https://www.prionalliance.org/]]
|DECIPHER|Rare Genetic Disorders|Genomic variants|[[DECIPHER>>url:https://decipher.sanger.ac.uk]]

If you know of a relevant dataset, submit a proposal in [[GitHub Issues>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/issues]].

----

== **Collaboration & Partnerships** ==

=== **Where to Insert This** ===

* GitHub: [[docs/collaboration.md>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/blob/main/docs/collaboration.md]]
* EBRAINS Wiki: Collabs/neurodiagnoses/Collaborations

=== **Partnering with Data Providers** ===

Beyond using existing datasets, Neurodiagnoses seeks partnerships with data repositories to:

* Enable direct API-based data integration for real-time processing.
* Co-develop harmonized AI-ready datasets with standardized annotations.
* Secure funding opportunities through joint grant applications.

=== **Interested in Partnering?** ===

If you represent a research consortium or database provider, reach out to explore data-sharing agreements.

* Contact: [[info@neurodiagnoses.com>>mailto:info@neurodiagnoses.com]]

----

== **Final Notes** ==

Neurodiagnoses continuously expands its data ecosystem to support AI-driven clinical decision-making. Researchers and institutions are encouraged to contribute new datasets and methodologies.

For additional technical documentation:

* [[GitHub Repository>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses]]
* [[EBRAINS Collaboration Page>>url:https://wiki.ebrains.eu/bin/view/Collabs/neurodiagnoses/]]

If you experience issues integrating data, open a [[GitHub Issue>>url:https://github.com/Fundacion-de-Neurociencias/neurodiagnoses/issues]] or consult the EBRAINS Neurodiagnoses Forum.