

Changes for page data-curation-copy

Last modified by eapapp on 2023/07/04 16:46

From version 147.2
edited by ingrreit
on 2023/06/05 09:28
Change comment: There is no comment for this version
To version 147.4
edited by ingrreit
on 2023/06/05 10:17
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -55,7 +55,7 @@
55 55  
56 56  ----
57 57  
58 -=== Step by step - Experimental data ===
58 +=== Step by step - Data ===
59 59  
60 60  
61 61  [[image:image-20230326054341-1.png]]
... ... @@ -64,45 +64,79 @@
64 64  
65 65  ==== **1. Provide some general information about your dataset** ====
66 66  
67 -The [[Curation request form>>https://nettskjema.no/a/277393#/]] collects preliminary information about your data, allowing us to assess whether the dataset fits within the scope of EBRAINS. The submission generates a curation ID allowing us to track the case.
68 68  
69 -The [[Ethics and Regulatory compliance form>>https://nettskjema.no/a/224765]] collects the necessary information needed for us to evaluate whether we can safely and legally share the data on the EBRAINS platforms.
68 +[[image:https://lh3.googleusercontent.com/zh7TvO6w04YGW9jIhfhmdT6CexdGs-AWOLfJXKRq7-tdHOu6ar1rOQx8o4rZevrjXqgPZ7-Ejv4b6X9XpgXuHpdUXi-mBTHIUnv5Vz-DktHt0sP-PZ3gE8XgZid3TV3swV1uTCBhHx11ge0pjP7RVxswGQ=s2048||height="85px;" width="91px;"]]** Fill in the [[Curation request form>>https://nettskjema.no/a/277393#/]]. **This form collects preliminary information about your data, allowing us to assess whether the dataset fits within the scope of EBRAINS. The submission generates a curation ID allowing us to track the case.
70 70  
71 71  
71 +[[image:https://lh6.googleusercontent.com/yw442oS6BwZOlY-_0BoVxyCW3DrdcJ5ogCes92iOD16_rgNEVk56aNMDaVWXFfBLYv24bHzmGgBF9wg0szjH70xzuRTqxoQAeuy3knNO7axCHoyZDXwtyTcMgFnYwbOYxOT29LK-zchrUKLW6Mle93kOkQ=s2048||height="94px;" width="94px;"]]**Fill in the [[Ethics and Regulatory compliance form>>https://nettskjema.no/a/224765]]**. This form collects the necessary information needed for us to evaluate whether we can ethically and legally share the data via EBRAINS.
72 +
73 +
72 72  ==== **2. Upload data ** ====
73 73  
74 -EBRAINS offers secure, long-term storage at [[CSCS Swiss National Supercomputing Centre>>url:https://www.cscs.ch/]], with currently no upper limit of storage capacity. The data must be consistently structured prior to upload. 
75 75  
76 -For smaller datasets with a reasonable number of files, we recommend using the Collab-Bucket solution (drag-and-drop). A Collab Bucket must first be assigned to a dataset, which happens when a dataset is accepted for sharing.
77 +[[image:https://lh5.googleusercontent.com/sieKO-kW8O18iPaUyonwyo4UfHBmtc2E9BDnjbx52j6J_uGmm-OzGAo7sloMk3sYwKa6QW3hYQsOA9N4H7uGQpca088Wrk0Nurpt_J3B0-NSbcaPNdZIh21otQcG6jnAxLGiKoEvkTyaDGTMk3fu7me8mQ=s2048||height="94px;" width="94px;"]]**Ensure data is structured consistently prior to upload. **We look for organized data, but we do not require that it follows a specific EBRAINS standard. This supports the broadest degree of sharing possible. We do, however, require that the data is organized in a consistent and precise manner. Please see our documentation for further guidance.
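As an illustration of what "consistent" can mean in practice, the sketch below checks a list of relative file paths against a single naming pattern before upload. The convention used here (sub-<number>/<modality>/sub-<number>_<modality>.<extension>) and the function name are hypothetical and are not an EBRAINS requirement; replace the pattern with whatever scheme your own dataset follows.

```python
import re

# Hypothetical naming convention, for illustration only:
#   sub-<digits>/<modality>/sub-<digits>_<modality>.<extension>
# The backreferences \1 and \2 require the file name to repeat the
# subject number and modality used in its parent folders.
PATTERN = re.compile(r"^sub-(\d+)/([a-z]+)/sub-\1_\2\.[a-z0-9]+$")


def find_inconsistent(paths):
    """Return the relative paths that do not follow PATTERN."""
    return [p for p in paths if not PATTERN.match(p)]


if __name__ == "__main__":
    example_paths = [
        "sub-01/ephys/sub-01_ephys.csv",
        "sub-02/ephys/subject2.csv",  # breaks the convention
    ]
    for path in find_inconsistent(example_paths):
        print("Inconsistent name:", path)
```

Running a check like this on the full file listing gives a quick list of files to rename before the data is sent for curation.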
77 77  
78 -For larger datasets or datasets with a large number of files, we recommend using a programmatic approach. The [[python script>>https://github.com/eapapp/ebrains-data-storage/tree/main/data-proxy]] is interactive and does not require any additional programming.
79 79  
80 +[[image:https://lh5.googleusercontent.com/EWtYwfVlbeC-jqPasgmzidqc50GrkKIEgwXeUeql8aaMHIukmFdWEy0nufVWWATbxDDK3XwwZEDmASrbpCsBk1u0HpAd8x4ZgAMsMPRcWyrb9etlV6FgKE_QN2e6SqKxHE0rzkR8uI1rRW_5z21TFGYVnw=s2048||height="91px;" width="91px;"]]**Upload data to EBRAINS Storage, either using a drag-and-drop solution (opt. 1) or an interactive python script (opt. 2).**
80 80  
82 +//Opt. 1. //For smaller datasets with a reasonable number of files, we recommend using the Collab-Bucket solution (drag-and-drop). A Collab Bucket must first be assigned to a dataset, which happens when a dataset is accepted for sharing.
83 +
84 +//Opt. 2. //For larger datasets or datasets with a large number of files, we recommend using a programmatic approach. The [[python script>>https://github.com/eapapp/ebrains-data-storage/tree/main/data-proxy]] is interactive and does not require any additional programming.
85 +
86 +
87 +EBRAINS offers secure, long-term storage at [[CSCS Swiss National Supercomputing Centre>>url:https://www.cscs.ch/]], with currently no upper limit of storage capacity. 
88 +
81 81  If a data collection is already uploaded elsewhere, we may link to the already existing repository.
82 82  
83 83  
84 84  ==== **3. Submit metadata** ====
85 85  
86 -Easily submit openMINDS-compatible metadata via our [[metadata wizard>>https://ebrains-metadata-wizard.apps.hbp.eu/]]. This form covers all the required metadata for sharing data via EBRAINS. When you're ready to 'Submit', the metadata and all uploaded files will be sent to the Curation team.
87 87  
88 -For power-users interested in exploring the full span of the openMINDS framework, please check out the [[openMINDS GitHub>>https://github.com/HumanBrainProject/openMINDS]] to learn more about how to programmatically gather your metadata. A stable version of the openMINDS package can be found on [[PyPi>>https://pypi.org/project/openMINDS/]]. We accept openMINDS metadata as JSON-LD (share these with us via curation-support@ebrains.eu). Additional documentation of openMINDS metadata submodules and schemas can be found on [[the openMINDS GitHub Wiki>>https://humanbrainproject.github.io/openMINDS/]]. We have prepared [[a list of the metadata properties that are required>>https://drive.ebrains.eu/lib/47995dbc-f576-4008-a76c-eefbfd818529/file/ebrains-minimum-required-metadata.xlsx]] for publishing data on EBRAINS.
95 +[[image:https://lh5.googleusercontent.com/WS4T2LhF9znWWChn3Z550agLrrb-KTWdYVsJSv0lh4cGjKbjuN1WV68WER9xkYqi1UqN7KYZz7bImYz3_TpOuTuvma7T192QUiUZoyJVPk1fj5NSDSQh_kpIeBufAOdDtsDRpPKK_P5EDPqRCTAaOTNyCw=s2048||height="91px;" width="91px;"]]**Submit metadata using our **[[EBRAINS Metadata wizard>>https://ebrains-metadata-wizard.apps.hbp.eu/]]** (opt. 1), or through direct interaction with the Knowledge Graph (opt. 2) **
89 89  
97 +//Opt. 1.// Manually submit the minimal required metadata via the [[EBRAINS Metadata wizard>>https://ebrains-metadata-wizard.apps.hbp.eu/]]. The minimal required metadata covers extended bibliographic information necessary to publish your dataset on EBRAINS. The submitted information, including uploaded files, will be sent to the Curation team automatically.
90 90  
99 +//Opt. 2.// To go beyond the minimal required metadata, you can directly interact with the Knowledge Graph (KG) in your private space. Within the private space, you can upload metadata and interact with them. You can also connect your metadata to existing publicly accessible entries. Access to your private space is granted upon the initiation of the curation process. You can access your private space via:
100 +
101 +* Knowledge Graph Editor: This User Interface allows you to manually enter metadata into your KG space and validate metadata that are programmatically uploaded. The Editor contains a basic set of openMINDS metadata templates, but can be extended to the full openMINDS metadata model on request. Access is granted once the request is accepted.
102 +* [[Fairgraph>>https://fairgraph.readthedocs.io/en/stable/]]: This is the recommended software tool for programmatic interaction with the KG. It allows you to programmatically upload openMINDS compliant metadata into your KG space and interact with existing metadata.
103 +* [[KG Core Python SDK>>https://github.com/HumanBrainProject/kg-core-sdks]]: This python package gives you full freedom in interacting with the KG. It allows you to upload any JSON-LD with metadata into your private space. Note that for dataset publications in EBRAINS, the JSON-LD metadata files have to comply with openMINDS.
104 +
105 +
106 +Datasets published through the EBRAINS Knowledge Graph have to be registered using **openMINDS compliant metadata** delivered as JSON-LD files. See this summary table for an overview of [[the minimally required openMINDS properties for publishing>>https://drive.ebrains.eu/lib/47995dbc-f576-4008-a76c-eefbfd818529/file/ebrains-minimum-required-metadata.xlsx]] on EBRAINS.
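To give a rough idea of what a JSON-LD metadata file looks like, the sketch below builds one small document with the Python standard library only. It is not an official openMINDS or Knowledge Graph example: the vocabulary IRI, the type IRI, the property names (fullName, description) and the placeholder identifier are assumptions made for illustration, and the document does not cover the minimally required properties. Check the openMINDS documentation and the summary table linked above before using any of these names.

```python
import json

# Assumed IRIs, for illustration only; verify them against the
# openMINDS documentation for the version you are targeting.
OPENMINDS_VOCAB = "https://openminds.ebrains.eu/vocab/"
DATASET_VERSION_TYPE = "https://openminds.ebrains.eu/core/DatasetVersion"


def build_dataset_version(local_id, full_name, description):
    """Build a minimal JSON-LD-style dictionary for one dataset version.

    The property names used here are assumptions and do not represent
    the full set of properties required for publishing on EBRAINS.
    """
    return {
        "@context": {"@vocab": OPENMINDS_VOCAB},
        "@id": f"http://localhost/{local_id}",  # placeholder identifier
        "@type": DATASET_VERSION_TYPE,
        "fullName": full_name,
        "description": description,
    }


def to_jsonld(document):
    """Serialize the dictionary to a JSON-LD string."""
    return json.dumps(document, indent=2, sort_keys=True)


if __name__ == "__main__":
    doc = build_dataset_version(
        "dsv-001",
        "Example dataset",
        "A short description of the example dataset.",
    )
    print(to_jsonld(doc))
```

In practice, tools such as Fairgraph or openMINDS Python generate and validate these files for you, so hand-written JSON-LD is mainly useful for understanding the format.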
107 +
108 +
109 +**The openMINDS metadata framework**
110 +
111 +openMINDS (open Metadata Initiative for Neuroscience Data Structures) is a community-driven, open-source metadata framework for graph database systems, such as the EBRAINS Knowledge Graph. It is composed of linked metadata models, libraries of serviceable metadata instances, and supportive tooling ([[openMINDS Python>>url:https://pypi.org/project/openMINDS/]], openMINDS Matlab). For exploring the openMINDS schemas, go to the [[HTML documentation>>url:https://humanbrainproject.github.io/openMINDS/]]. For a full overview of the framework, go to [[the openMINDS collab>>url:https://wiki.ebrains.eu/bin/view/Collabs/openminds/]] or the [[GitHub repository>>https://github.com/HumanBrainProject/openMINDS]].
112 +
113 +For feedback, requests, or contributions, please get in touch with the openMINDS development team via
114 +
115 +* the support-email: [[openminds@ebrains.eu>>path:mailto:openminds@ebrains.eu]]
116 +* the [[GitHub issue tracker>>url:https://github.com/HumanBrainProject/openMINDS/issues]]
117 +* the INCF NeuroStars [[openMINDS Community Forum>>url:https://neurostars.org/t/openminds-community-forum-virtual/20156]]
118 +
119 +
91 91  ==== **4. Write a Data Descriptor ** ====
92 92  
93 -The Data Descriptor is a document helping others interpret and reuse (and prevent misuse) of your data, and is critical to achieve a basic level of FAIR. The document will be uploaded in the repository of the data, shared as a PDF. 
94 94  
95 -[[The template >>https://drive.ebrains.eu/f/a2e07c95b1a54090bbbc/?dl=1]]safely guides you through the process of making one. Check out previous examples in the KG Search, e.g. the Data Descriptor for a dataset containing histology images of the rat brain stained for an anterograde tracer (see [[an example>>https://doi.org/10.25493/2MX9-3XF]]).
123 +[[image:https://lh4.googleusercontent.com/lMYEKOXzejbBydOdotWWteXQo7j363xRyntBGjcPZVEdtIU1CJYX7q1STpdr2JPZK4hpWWXk20UlkUOqDGL5kX6vnQVBSdrfUo6EGfXOwpuGq1Uygv0tTZJ0lRO6voJvg56QC2mufvjAcRXGfAKFOjtc6w=s2048||height="94px;" width="94px;"]]**Write a data descriptor by filling in **[[this template>>https://drive.ebrains.eu/f/a2e07c95b1a54090bbbc/?dl=1]]**. **The Data Descriptor is a document that helps others interpret and reuse your data (and prevents misuse), and is critical to achieving a basic level of FAIR. The document will be uploaded to the data repository and shared as a PDF. 
96 96  
97 97  
126 +Check out previous examples in the KG Search! See e.g., the data descriptor for the dataset "[[Anterogradely labeled axonal projections from the orbitofrontal cortex in rat>>https://doi.org/10.25493/2MX9-3XF]]".
127 +
98 98  Journal publications sufficiently describing the shared data, such as made available through [[Nature Scientific Data>>http://www.nature.com/sdata/about]], [[Elsevier Data in Brief>>http://www.journals.elsevier.com/data-in-brief/]], [[BMC Data note>>https://bmcresnotes.biomedcentral.com/submission-guidelines/preparing-your-manuscript/data-note]] and more, can replace the EBRAINS Data Descriptor.
99 99  
100 100  
131 +|(% style="width:175px" %)[[[[image:image-20230324171109-1.png||height="154" width="109"]]>>https://drive.ebrains.eu/f/c1ccb78be52e4bdba7cf/]]|(% style="width:1662px" %)The EBRAINS Data descriptor at-a-glance
132 +
133 +
101 101  ==== **5. Preview and publish ** ====
102 102  
103 -A Curator will assemble a dataset in the EBRAINS Knowledge Graph that combines the data, metadata and data descriptor. Once ready, the data provider will receive a private URL for previewing the dataset prior to release. We need an official approval from the data custodian{{footnote}}The Data Custodian is responsible for the content and quality of the Data and metadata, and is the person to be contacted by EBRAINS CS in case of any misconduct related to the Data. It is the obligation of a Data Custodian to keep EBRAINS informed about changes in the contact information of the authors of the Datasets provided by them ([[EBRAINS Data Provision Protocol - version 1.1>>https://strapi-prod.sos-ch-dk-2.exo.io/EBRAINS_Data_Provision_Protocol_dfe0dcb104.pdf]]).{{/footnote}} to release the dataset. Once released, a [[DataCite DOI>>https://datacite.org/]] will be generated for the dataset. If the identical data collection has received a DOI elsewhere, we recommend re-using the already issued DOI.
104 104  
137 +[[image:https://lh4.googleusercontent.com/XqT26Q4yWJK26cjtjhI4ToXoZZMxhT9LimG4Hk9mePxy0-KPKgpVIzcuiP5mOQowBgf2JjkrWUq2VbCmafWWZPJplEZALnFOlCZHLlQgzOx7fFwoBteyi_IlMLkPBS9vtOcdNIZ59HyLnQz4RsTQ0lUrSw=s2048||height="91px;" width="91px;"]]**Preview and approve the release of your dataset. **Once a Curator has assembled the dataset in the EBRAINS Knowledge Graph, combining the data, metadata and data descriptor, the data provider will receive a private URL for previewing the dataset prior to release. We need an official approval from the data custodian{{footnote}}The Data Custodian is responsible for the content and quality of the Data and metadata, and is the person to be contacted by EBRAINS CS in case of any misconduct related to the Data. It is the obligation of a Data Custodian to keep EBRAINS informed about changes in the contact information of the authors of the Datasets provided by them ([[EBRAINS Data Provision Protocol - version 1.1>>https://strapi-prod.sos-ch-dk-2.exo.io/EBRAINS_Data_Provision_Protocol_dfe0dcb104.pdf]]).{{/footnote}} to release the dataset. Once released, a [[DataCite DOI>>https://datacite.org/]] will be generated for the dataset. If the identical data collection has received a DOI elsewhere, we recommend re-using the already issued DOI.
105 105  
139 +
106 106  ----
107 107  
108 108  ==== **Sharing human data ** ====
... ... @@ -109,42 +109,36 @@
109 109  
110 110  (% class="box floatinginfobox" %)
111 111  (((
112 -For **Human subject data**, the data must be //either//
146 +**Human subject data that can be shared on EBRAINS consists of:**
113 113  
114 114  - Post-mortem data
115 115  - Aggregated data
116 -- Pseudonymized subject data with a legal basis for sharing (e.g. Informed Consent)
150 +- Strongly pseudonymized or de-identified subject data with a legal basis for sharing (e.g. Informed Consent)
117 117  
118 -(% class="small" %)//If you have human data that do not classify as any of the above, please get in touch and we will clarify the available options. //
152 +(% class="small" %)//If you have human data that does not classify as any of the above, please get in touch and we will clarify the available options. //
119 119  )))
120 120  
121 -We must ensure data shared on EBRAINS comply with [[GDPR >>https://gdpr-info.eu/]]and [[EU directives>>https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A32010L0063]]. The information we need to assess this is collected via our [[Ethics and Regulatory Compliance Survey>>https://nettskjema.no/a/224765]].
155 +Human data shared on EBRAINS must comply with [[GDPR>>https://gdpr-info.eu/]] and [[EU directives>>https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A32010L0063]]. The information we need to assess this is collected via our [[Ethics and Regulatory Compliance Survey>>https://nettskjema.no/a/224765]].
122 122  
123 -Post-mortem and aggregated human data can be shared openly. Pseudonymized data is shared via the Human Data Gateway (HDG) due to GDPR regulations. The HDG adds an authentication layer to the data.
157 +Post-mortem and aggregated human data can be shared openly, provided that direct identifiers in the metadata are removed. Strongly pseudonymized and de-identified data can be shared via the Human Data Gateway (HDG).
124 124  
125 -**Data users** must request access to the data (via their EBRAINS account) and will receive access provided they actively accept the [[EBRAINS Access Policy>>https://ebrains.eu/terms#access-policy]], the [[EBRAINS General Terms of Use>>https://ebrains.eu/terms#general-terms-of-use]], and the [[EBRAINS Data Use Agreement>>https://ebrains.eu/terms#data-use-agreement]]. The account holder also has to accept that information about their request and access to specific data under HDG is being tracked and stored.
126 -\\**Data owners** must be aware that sharing under the HDG affects the legal responsibilities for the data. They must agree to joint control of the data (see the [[Data Provision Protocol v1>>url:https://strapi-prod.sos-ch-dk-2.exo.io/EBRAINS_Data_Provision_Protocol_dfe0dcb104.pdf]], section 1.4 - 1.5) and the Data Protection Officers of the responsible institutions must have accepted that the data can be shared under HDG.
127 -\\The Human Data Gateway (HDG) was introduced in February 2021 and developed across multiple teams in the HBP. The initiative to create the service and the initial design originated from EBRAINS Curation in close collaboration with the Data compliance team and the HBP Data Governance Working Group. HDG is a response to the needs of multiple data providers who are bringing data of human origin to EBRAINS. HDG covers the sharing of a limited range of data of human origin, i.e., data without direct identifiers and with very few indirect identifiers (strongly pseudonymized, de-identified). It is an extension of the existing services and does not replace the future EBRAINS Service for sensitive data (planned for 2024) which is outside the domain of the current EBRAINS Data and Knowledge services.
159 +The Human Data Gateway (HDG) was introduced in February 2021 as a response to the needs of multiple data providers who are bringing human subject data to EBRAINS. HDG covers the sharing of strongly pseudonymized or de-identified data, a limited range human subject data without direct identifiers and with very few indirect identifiers.
128 128  
161 +The HDG adds an authentication layer on top of the data. This means that **data users **must request access to the data (via their EBRAINS account) and will receive access provided they actively accept the [[EBRAINS Access Policy>>https://ebrains.eu/terms#access-policy]], the [[EBRAINS General Terms of Use>>https://ebrains.eu/terms#general-terms-of-use]], and the [[EBRAINS Data Use Agreement>>https://ebrains.eu/terms#data-use-agreement]]. The account holder also has to accept that information about their request and access to specific data under HDG is being tracked and stored. **Data owners** must be aware that sharing under the HDG affects the legal responsibilities for the data. They must agree to joint control of the data (see the [[Data Provision Protocol v1>>url:https://strapi-prod.sos-ch-dk-2.exo.io/EBRAINS_Data_Provision_Protocol_dfe0dcb104.pdf]], section 1.4 - 1.5) and the Data Protection Officers of the responsible institutions must have accepted that the data can be shared under HDG.
129 129  
163 +The HDG is an extension of the existing services and does not replace the future EBRAINS Service for sensitive data (planned for 2024) which is outside the domain of the current EBRAINS Data and Knowledge services.
164 +
165 +
130 130  ----
131 131  
132 132  === Step by Step - Models ===
133 133  
134 -(% style="color:#e74c3c" %){{mention reference="XWiki.adavison" style="FULL_NAME" anchor="XWiki-adavison-1rb0hn"/}}
135 135  
136 -[place-holder-process-diagram]
171 +~1. Request curation using the [[Curation request form>>https://nettskjema.no/a/277393#/]]. You will be contacted by a curator with more information.
137 137  
138 -==== **1. model step 1 ** ====
139 139  
140 -Text
174 +//Additional information will be added soon.//
141 141  
142 -
143 -==== **2. model step 2** ====
144 -
145 -Text
146 -
147 -
148 148  ----
149 149  
150 150  === Step by Step - Code ===
... ... @@ -162,14 +162,13 @@
162 162  
163 163  ----
164 164  
165 -=== Webservices and metadata models ===
193 +== **The curation team: meet the curators** ==
166 166  
167 -(% class="wikigeneratedid" id="HContact...." %)
168 -(% style="color:#e74c3c; font-size:16px" %){{mention reference="XWiki.adavison" style="FULL_NAME" anchor="XWiki-adavison-np253c"/}}(% style="color:#4a5568; font-size:16px" %)
195 +The EBRAINS curators help researchers publish their research using the EBRAINS Research Infrastructure. A curator’s job is similar to that of an editor of a scientific journal: checking that the data is organized, understandable, accessible and sufficiently described.
169 169  
170 -----
171 171  
172 -== **The curation team: meet the curators** ==
198 +The curators in EBRAINS are located in Oslo, Jülich, Trier and Paris. 
199 +
173 173  
174 174  **Located in Norway:**
175 175  
... ... @@ -311,7 +311,7 @@
311 311  |(% style="width:439px" %)(((
312 312  [[[[image:image-20230324171114-2.png||height="354" width="250"]]>>https://drive.ebrains.eu/f/dfd374b9b43a458192e9/]]
313 313  )))|(% style="width:461px" %)(((
314 -[[[[image:image-20230324171109-1.png||height="352" width="250"]]>>https://drive.ebrains.eu/f/c1ccb78be52e4bdba7cf/]]
341 +
315 315  )))|(% style="width:416px" %)[[[[image:image-20230330120354-1.png||height="352" width="250"]]>>https://drive.ebrains.eu/f/707147a883b94fae8e69/]]
316 316  |(% style="width:439px" %)//Collection of useful information for researchers looking to share experimental data on EBRAINS.//|(% style="width:461px" %)//The EBRAINS data descriptor: a general overview //|(% style="width:416px" %)//Introduction to data organization: A [[collection of guidelines>>https://drive.ebrains.eu/smart-link/25299f04-c4e5-4028-8f5f-3b8208f9a532/]] on how to organise files and folders to ensure consistency and reproducibility in the future. //
317 317