Changes for page 3 Try it yourself!
Last modified by maaike on 2022/07/06 10:08
Summary
-
Page properties (1 modified, 0 added, 0 removed)
-
Attachments (0 modified, 1 added, 0 removed)
Details
- Page properties
-
- Content
-
... ... @@ -6,7 +6,7 @@ 6 6 1. Find datasets that contain NIfTI files 7 7 1. Find the software that can be used to open .smr file formats? 8 8 9 -[[ https:~~/~~/lab.ch.ebrains.eu/hub/user-redirect/lab/tree/shared/Practical%20Guide%20to%20Using%20the%20EBRAINS%20Knowledge%20Graph%20in%20(your)%20Research%20-%20User%20Examples/query.ipynb>>https://lab.ch.ebrains.eu/hub/user-redirect/lab/tree/shared/Practical%20Guide%20to%20Using%20the%20EBRAINS%20Knowledge%20Graph%20in%20(your)%20Research%20-%20User%20Examples/query.ipynb]]9 +[[Log in to your EBRAINS account and try running the examples yourself!>>https://lab.ch.ebrains.eu/hub/user-redirect/lab/tree/shared/Practical%20Guide%20to%20Using%20the%20EBRAINS%20Knowledge%20Graph%20in%20(your)%20Research%20-%20User%20Examples/query.ipynb]] 10 10 11 11 === Example 1 - How many datasets used human subjects? === 12 12 ... ... @@ -14,11 +14,17 @@ 14 14 15 15 In the filter function, select "Homo Sapiens" under species. This filters the available datasets in the Knowledge Graph for human subjects only. 
16 16 17 -[[https: ~~/~~/search.kg.ebrains.eu/?facet_type[0]=Dataset&facet_Dataset_speciesFilter[0]=Homo%20sapiens>>https://search.kg.ebrains.eu/?facet_type[0]=Dataset&facet_Dataset_speciesFilter[0]=Homo%20sapiens]]17 +[[[[image:HumanData.png]]>>https://search.kg.ebrains.eu/?facet_type[0]=Dataset&facet_Dataset_speciesFilter[0]=Homo%20sapien&category=Dataset&species[0]=Homo%20sapiens]] 18 18 19 +[[https:~~/~~/search.kg.ebrains.eu/?facet_type[0]=Dataset&facet_Dataset_speciesFilter[0]=Homo%20sapien&category=Dataset&species[0]=Homo%20sapiens>>https://search.kg.ebrains.eu/?facet_type[0]=Dataset&facet_Dataset_speciesFilter[0]=Homo%20sapien&category=Dataset&species[0]=Homo%20sapiens]] 20 + 19 19 ==== **Query Builder** ==== 20 20 21 -To search for datasets containing human subjects only, you can first declare the id (to get the involved instances) as well as a link to the "studied specimen" (you can add a type filter and restrict it to "Subject" and "Subject group" only since we are not interested in "Tissue samples" and "Tissue sample collections" in this moment). From "studied specimen", we're interested in the "Species" (here you can - again add a type filter to exclude "Strain" since this is irrelevant for human subjects). For the "Species", we want the "label" to contain "homo sapiens" which is why we add a filter "contains" with the value "homo sapiens". We might want to simplify the deeply nested structure by "flattening" both, the "Studied specimen" as well as the "Species". Once we go to the "execute query" section (the play button on the left) and we run the query, we can see the total number of dataset versions. 23 +To search for datasets containing human subjects only, we will therefore execute the query against the "dataset version" data structure. We want to know the persistent identifier and name of the dataset version, so we declare the "id" and "lookup label" first. 
Since our objective is to filter datasets based on the species of the subjects, we need to specify "studied specimen" in our query too. We have four specimen categories, "subjects", "subject group", "tissue sample", and "tissue sample collection". We add a "type filter" to restrict our results to "Subject" and "Subject group" since we are currently not interested in "Tissue samples" and "Tissue sample collections". To ensure that we only get datasets with human subjects, we can define the "Species" under "studied specimen" (again, you can add a type filter to exclude "Strain" since this is irrelevant for human subjects). For the "Species", we want the "label" to contain "homo sapiens" which is why we add a filter "CONTAINS" with the value "Homo sapiens". 24 + 25 +For graph databases, like the EBRAINS Knowledge Graph, it is very easy to create very long and complex queries. We can simplify deeply nested structures by "flattening" the query. This is only possible when a property has only 1 nested property ("child"). In our query, this is the case for the "Studied specimen" and the "Species". 26 + 27 +Once we have built our query, we can go to the "execute query" section (the play button on the left) and run the query to see the total number of dataset versions. 22 22 \\Please note that this number can differ from the one you figured out in the search UI. The reason is that the search UI counts only the newest dataset version, whilst the query also returns older dataset versions. 23 23 24 24 {{code language="json" layout="LINENUMBERS"}} ... ... @@ -41,6 +41,10 @@ 41 41 }, 42 42 "structure": [ 43 43 { 50 + "propertyName": "query:shortName", 51 + "path": "https://openminds.ebrains.eu/vocab/shortName" 52 + }, 53 + { 44 44 "propertyName": "query:id", 45 45 "path": "@id" 46 46 }, ... ... @@ -82,12 +82,16 @@ 82 82 83 83 ==== **Search UI** ==== 84 84 85 -All the metadata in the knowledge graph is represented by nodes and their relationships by the edges. 
Most of the "basic" metadata is visualised in the Search UI to make it easy for the user to find datasets that fit certain criteria without needing to know how to navigate and traverse a graph structure. 95 +All the metadata in the knowledge graph is represented by nodes and their relationships by the edges. Most of the "basic" metadata is visualised in the Search UI to make it easy for the user to find datasets that fit certain criteria without needing to know how to navigate and traverse a graph structure. When searching for "male adult" subjects in the search UI, we find datasets that have these keywords in any of the text summarised on the dataset card (it is a 'fuzzy search': [[https:~~/~~/search.kg.ebrains.eu/?category=Dataset&q=male%20and%20adult>>https://search.kg.ebrains.eu/?category=Dataset&q=male%20and%20adult]]). To ensure we only look for specimens (subjects or samples) originating from male adult mice, we need to write a query and extract the metadata programmatically. 86 86 87 87 ==== **Query Builder** ==== 88 88 89 -For dataset versions that use male adult subjects, we can filter datasets using these 2 properties. The easiest way is to add a required filter to biological sex that is "EQUAL" to "male" and the age category "EQUAL" to "adult". By selecting the filter "EQUAL" instead of "CONTAINS", we ensure that only datasets with adult animals are found. If we want to be more general and include all subjects from the onset of sexual maturity, we can use "CONTAINS" instead as this will include subjects with the age category "prime adult", "young adult" and "late adult" as well. 99 +For dataset versions that use male adult subjects, we can filter datasets using these 2 properties. The easiest way is to add a required filter to biological sex that is "EQUAL" to "male" and the age category "EQUAL" to "adult". By selecting the filter "EQUAL" instead of "CONTAINS", we ensure that only datasets with adult animals are found. 
If we want to be more general and include all subjects from the onset of sexual maturity, we can use "CONTAINS" instead as this will include subjects with the age category "prime adult", "young adult" and "late adult" as well. 90 90 101 +We have again taken advantage of the type filter (set it to subjects and subject groups), and we flattened the query where possible. This means that you now find multiple elements in the "path" for "biological sex" and for the "age category". 102 + 103 +Try it yourself and check out the differences between the results of the flattened and unflattened queries! 104 + 91 91 {{code language="json" layout="LINENUMBERS"}} 92 92 { 93 93 "@context": { ... ... @@ -108,54 +108,60 @@ 108 108 }, 109 109 "structure": [ 110 110 { 111 - "propertyName": "query:id", 112 - "path": "@id" 113 - }, 114 - { 115 115 "propertyName": "query:shortName", 116 116 "path": "https://openminds.ebrains.eu/vocab/shortName" 117 117 }, 118 118 { 129 + "propertyName": "query:id", 130 + "path": "@id" 131 + }, 132 + { 119 119 "propertyName": "query:studiedSpecimen", 120 - "path": "https://openminds.ebrains.eu/vocab/studiedSpecimen", 134 + "path": { 135 + "@id": "https://openminds.ebrains.eu/vocab/studiedSpecimen", 136 + "typeFilter": [ 137 + { 138 + "@id": "https://openminds.ebrains.eu/core/Subject" 139 + }, 140 + { 141 + "@id": "https://openminds.ebrains.eu/core/SubjectGroup" 142 + } 143 + ] 144 + }, 121 121 "required": true, 122 122 "structure": [ 123 123 { 148 + "propertyName": "query:lookupLabel", 149 + "path": "https://openminds.ebrains.eu/vocab/lookupLabel" 150 + }, 151 + { 124 124 "propertyName": "query:id", 125 125 "path": "@id" 126 126 }, 127 127 { 128 128 "propertyName": "query:biologicalSex", 129 - "path": "https://openminds.ebrains.eu/vocab/biologicalSex", 130 130 "required": true, 131 - "structure": { 132 - "propertyName": "query:name", 133 - "path": "https://openminds.ebrains.eu/vocab/name", 134 - "required": true, 135 - "filter": { 136 - "op": "EQUALS", 
137 - "value": "male" 138 - } 139 - } 158 + "filter": { 159 + "op": "EQUALS", 160 + "value": "male" 161 + }, 162 + "path": [ 163 + "https://openminds.ebrains.eu/vocab/biologicalSex", 164 + "https://openminds.ebrains.eu/vocab/name" 165 + ] 140 140 }, 141 141 { 142 142 "propertyName": "query:studiedState", 143 - "path": "https://openminds.ebrains.eu/vocab/studiedState", 144 144 "required": true, 145 - "structure": { 146 - "propertyName": "query:ageCategory", 147 - "path": "https://openminds.ebrains.eu/vocab/ageCategory", 148 - "required": true, 149 - "structure": { 150 - "propertyName": "query:name", 151 - "path": "https://openminds.ebrains.eu/vocab/name", 152 - "required": true, 153 - "filter": { 154 - "op": "EQUALS", 155 - "value": "adult" 156 - } 157 - } 158 - } 170 + "filter": { 171 + "op": "EQUALS", 172 + "value": "adult" 173 + }, 174 + "path": [ 175 + "https://openminds.ebrains.eu/vocab/studiedState", 176 + "https://openminds.ebrains.eu/vocab/ageCategory", 177 + "https://openminds.ebrains.eu/vocab/name" 178 + ] 159 159 } 160 160 ] 161 161 } ... ... @@ -177,9 +177,11 @@ 177 177 178 178 To find datasets that contain a particular file format, we can write a query based on either 1) the file extension or 2) the content type. The difference between the two approaches is that the first approach just looks at the file extension without considering what type of file format it is or what software can be used to open it. For example, both nifti 1 and nifti 2 files have the same extension. The nifti 2 format is an update of nifti 1 and will not be recognised as a valid nifti 1 format. This is important when considering what program to use when opening the files. To be able to differentiate, we describe the files with content types that tell the user what type of file format it is, and we have linked a number of software applications to that content type to facilitate reuse of the data. 
179 179 200 +For these examples, we are showing the unflattened version of the query. Try it yourself to create a flattened version! 201 + 180 180 **Query datasets based on file extension** 181 181 182 -We can restrict the search results with a filter using a required field. In this particular case a filter that "ENDS_WITH" a value (e.g. .nii.gz) could be used. We can use .nii for normal nifti files or .nii.gz for compressed nift yfiles. 204 +We can restrict the search results with a filter using a required field. In this particular case a filter that "ENDS_WITH" a value (e.g. .nii.gz) could be used. We can use .nii for normal nifti files or .nii.gz for compressed nifti files. 183 183 184 184 {{code language="json" layout="LINENUMBERS"}} 185 185 { ... ... @@ -297,12 +297,21 @@ 297 297 298 298 ==== **Search UI** ==== 299 299 322 +To find software that can open a particular file format like the Spike2 file format (.smr), we can select the category "software" and then filter based on "input format". We select "application/vnd.spike2.sonpy.son" to ensure we only get software for this file format. 300 300 324 +[[[[image:SoftwareSearch.png||alt="Software Search"]]>>https://search.kg.ebrains.eu/?category=Software&inputFormats[0]=application%2Fvnd.spike2.sonpy.so]] 325 + 301 301 [[https:~~/~~/search.kg.ebrains.eu/?category=Software&inputFormats[0]=application%2Fvnd.spike2.sonpy.son>>https://search.kg.ebrains.eu/?category=Software&inputFormats[0]=application%2Fvnd.spike2.sonpy.son]] 302 302 303 303 304 304 ==== **Query Builder** ==== 305 305 331 +For this question, we will execute the query against the "softwareVersion" data structure. We ask for the name, version and input type of the software. We further refine our query by restricting the result to software that can open files with the file extension ".smr". We get the same 3 software types as in the search, and we immediately see that one of the software types has multiple versions that can open this kind of file. 
332 + 333 +(This query is not flattened.) 334 + 335 +You may have noticed that even though you can get the same results in the Search UI and the Query Builder, the details you need to define to get to that result are not always the same. For example, in the query we can rely on the file extension (i.e. ".smr"), whereas in the Search UI we sometimes need specific details about the file we are trying to open, such as the software/data acquisition/analysis package (i.e. Spike2) that was used. Keep this in mind when tackling a particular question. Queries can be made as detailed or as broad as you want, whereas the Search UI is a snapshot of the most common and basic information that is available. 336 + 306 306 {{code language="json" layout="LINENUMBERS"}} 307 307 { 308 308 "@context": {
- HumanData.png
-
- Author
-
... ... @@ -1,0 +1,1 @@ 1 +XWiki.maaike - Size
-
... ... @@ -1,0 +1,1 @@ 1 +400.8 KB - Content