Wiki source code of Technical details

Version 66.1 by lzehl on 2021/06/24 16:07

1 (% class="box infomessage" %)
2 (((
3 (% style="text-align: justify;" %)
4 openMINDS is designed to be as modular as possible, in order to facilitate the extension and maintenance of existing metadata models and schemas as well as the development and integration of new ones. The layout and technical requirements for this modularity are described below.
5
6 (% style="text-align: justify;" %)
7 In parallel, openMINDS tries to accommodate the varying levels of programming skill present in the neuroscience research community. For this reason, openMINDS established an integration pipeline that gradually increases the level of technical detail: starting from a user-friendly, lightweight schema template and ending with established, highly technical metadata schema formats (e.g., JSON-Schema).
8
9 (% style="text-align: justify;" %)
10 Below you will find documentation of the layout and requirements needed to maintain the openMINDS modularity, the syntax of the openMINDS schema template, and the openMINDS integration pipeline.
11 )))
12
13 === The openMINDS umbrella ===
14
15 (% style="text-align: justify;" %)
16 In summary, openMINDS is the overall umbrella for a set of distributed GitHub repositories, each defining a particular metadata model for neuroscience research products.
17
18 (% style="text-align: justify;" %)
19 The main (or central) [[openMINDS GitHub repository>>https://github.com/HumanBrainProject/openMINDS||rel="noopener noreferrer" target="_blank"]] ingests all these GitHub repositories as [[git-submodules>>https://git-scm.com/docs/git-submodule||rel="noopener noreferrer" target="_blank"]]. Furthermore, it stores the openMINDS vocabulary (**##vocab##**), which provides general definitions and references for **types** and **properties** used in schemas across all openMINDS repositories (cf. below). Finally, it holds the schema representations for all supported metadata formats created by the openMINDS integration pipeline (cf. below).
20
21 (% style="text-align: justify;" %)
22 For this to work smoothly for the existing, but also for all new openMINDS metadata models, the corresponding openMINDS submodules (GitHub repositories) have to meet the following requirements:
23
24 (% style="text-align: justify;" %)
25 **(1)** The openMINDS metadata model has to be located on a **public GitHub repository** and published under an **MIT license**.
26
27 (% style="text-align: justify;" %)
28 **(2)** The GitHub repository should have at least one **version branch** (e.g., "v1").
29
30 (% style="text-align: justify;" %)
31 **(3)** The version branch should have the following **folders in its main directory**: **##schemas##** (required), **##tests##** (recommended), **##examples##** (recommended), and **##img##** (optional).
32
33 (% style="text-align: justify;" %)
34 **(4)** The **##schemas##** folder should contain the schemas of that metadata model implemented in the **openMINDS schema template syntax** (cf. below). The directory of the schemas can be further structured or flat.
35
36 (% style="text-align: justify;" %)
37 **(5)** The **##tests##** folder should contain test-instances (JSON-LDs) for the schemas in a flat directory. The file names for these test-instances should follow the convention of
38
39 (% style="text-align: center;" %)
40 **##<<XXX>>-<<YYY>>.jsonld##**
41
42 (% style="text-align: justify;" %)
43 for files that should pass the tests, and
44
45 (% style="text-align: center;" %)
46 **##<<XXX>>-<<YYY>>-nok.jsonld##**
47
48 (% style="text-align: justify;" %)
49 for files that should fail the tests. In both cases, **##<<XXX>>##** should be replaced with the label of the schema that is tested, and **##<<YYY>>##** with a user-defined label for the aspect that is tested (e.g., **##person-withoutCI.jsonld##**).
50
51 (% style="text-align: justify;" %)
52 **(6)** The **##examples##** folder should contain examples of valid instance collections for that metadata model. Each example should receive its own directory (folder) with a **##README.md##** describing the example, and a **##metadataCollection##** subfolder containing the openMINDS instances (JSON-LDs). This subfolder can be further structured or flat.
53
54 (% style="text-align: justify;" %)
55 **(7)** The **##img##** folder should contain image files used on that GitHub repository (e.g., the logo of the new openMINDS metadata model). The directory of the images can be further structured or flat.
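
The folder layout from requirement (3) can be checked with a few lines of Python. The following is only a minimal sketch and not part of the openMINDS tooling: the helper name ##check_layout## and the wording of the messages are assumptions made for illustration.

```python
from pathlib import Path

# Folder names and their status as listed in requirement (3).
EXPECTED_FOLDERS = {
    "schemas": "required",
    "tests": "recommended",
    "examples": "recommended",
    "img": "optional",
}


def check_layout(repo_root):
    """Return (errors, warnings) for a checked-out version branch.

    Hypothetical helper: it only tests whether the folders exist,
    not what they contain.
    """
    root = Path(repo_root)
    errors, warnings = [], []
    for folder, status in EXPECTED_FOLDERS.items():
        if (root / folder).is_dir():
            continue
        if status == "required":
            errors.append(f"missing required folder: {folder}")
        elif status == "recommended":
            warnings.append(f"missing recommended folder: {folder}")
        # A missing optional folder is not reported at all.
    return errors, warnings
```

Calling ##check_layout(".")## from the root of a checked-out version branch returns two lists; an empty first list means that the required **##schemas##** folder is present.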
56
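
The file naming convention for test-instances from requirement (5) can be illustrated in the same way. This sketch is not taken from the openMINDS pipeline: the names ##InstanceName## and ##parse_test_filename## are hypothetical, and the code assumes that the schema label itself contains no hyphen.

```python
from typing import NamedTuple


class InstanceName(NamedTuple):
    schema_label: str   # <<XXX>>: label of the schema being tested
    aspect_label: str   # <<YYY>>: user-defined label of the tested aspect
    should_pass: bool   # False if the file name ends with "-nok.jsonld"


def parse_test_filename(filename: str) -> InstanceName:
    """Split a test-instance file name into its parts (hypothetical helper)."""
    suffix = ".jsonld"
    if not filename.endswith(suffix):
        raise ValueError(f"not a JSON-LD test instance: {filename!r}")
    stem = filename[: -len(suffix)]

    # A trailing "-nok" marks instances that are expected to fail the tests.
    should_pass = not stem.endswith("-nok")
    if not should_pass:
        stem = stem[: -len("-nok")]

    # Assumption: the schema label contains no hyphen, so the first hyphen
    # separates <<XXX>> from <<YYY>>.
    schema_label, sep, aspect_label = stem.partition("-")
    if not sep or not schema_label or not aspect_label:
        raise ValueError(f"expected '<<XXX>>-<<YYY>>' in {filename!r}")
    return InstanceName(schema_label, aspect_label, should_pass)
```

For the example file name given above, ##parse_test_filename("person-withoutCI.jsonld")## yields the schema label ##person##, the aspect label ##withoutCI##, and marks the file as one that should pass.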
57 === The openMINDS vocabulary ===
58
59 (% style="text-align: justify;" %)
60 Through the integration pipeline of the openMINDS generator, the openMINDS vocabulary is automatically gathered and stored in the main openMINDS GitHub repository in order to centrally maintain general definitions and references for **types** and **properties** used in schemas across all openMINDS repositories. How this works is explained in the following.
61
62 (% style="text-align: justify;" %)
63 Schema types and properties are stored in dedicated JSON files (**##types.json##** and **##properties.json##**) in the folder **##vocab##** located in the main directory of the main openMINDS GitHub repository. Each schema type and property occurring in the openMINDS metadata models is automatically represented in those files as nested dictionaries. Here is an excerpt from **##types.json##**:
64
65 {{code language="json"}}
{
  ...,
  "https://openminds.ebrains.eu/core/Person": {
    "description": "Structured information on a person (alive or dead).",
    "name": "Person",
    "translatableTo": [
      "https://schema.org/Person"
    ]
  },
  ...
}
77 {{/code}}
78
79 ... and an excerpt from **##properties.json##**:
80
81 {{code language="json"}}
{
  ...,
  "givenName": {
    "description": "Name given to a person, including all potential middle names, but excluding the family name.",
    "name": "Given name",
    "nameForReverseLink": "Is given name of",
    "sameAs": [
      "https://schema.org/givenName"
    ],
    "schemas": [
      "core/v3/actors/person.schema.tpl.json"
    ]
  },
  ...
}
97 {{/code}}
98
99 (% style="text-align: justify;" %)
100 The keys of those nested dictionaries are pre-defined to consistently capture, for all schema types and properties, their namespace, their occurrence (cf. **##"schemas"##** in **##properties.json##**), their general description (cf. **##"description"##** in **##types.json##** and **##properties.json##**), and possible references to related or matching schema types (cf. **##"translatableTo"##** in **##types.json##**) and properties (cf. **##"sameAs"##** in **##properties.json##**) of other metadata initiatives (e.g., schema.org).
101
102 (% style="text-align: justify;" %)
103 This setup also allows us to define some values/entries to be automatically filled in by the openMINDS integration pipeline with each commit to one of the openMINDS repositories (**##"name"##**, **##"schemas"##**) and others to be manually editable later on (**##"description"##**, **##"translatableTo"##**, **##"sameAs"##**, **##"nameForReverseLink"##**).
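
The split between automatically filled and manually editable entries can be pictured as a simple merge. The sketch below is an illustration only and does not reproduce the actual pipeline code: the function ##update_entry## is hypothetical, and the second schema path in the usage example is made up.

```python
# Keys refreshed by the pipeline with each commit.
AUTO_KEYS = ("name", "schemas")
# Keys that are edited by hand and must survive an update.
MANUAL_KEYS = ("description", "translatableTo", "sameAs", "nameForReverseLink")


def update_entry(existing, generated):
    """Merge a freshly generated entry into an existing vocabulary entry.

    Hypothetical helper: manual values come from the existing entry,
    automatic values from the newly generated one.
    """
    merged = {key: existing[key] for key in MANUAL_KEYS if key in existing}
    merged.update({key: generated[key] for key in AUTO_KEYS if key in generated})
    return merged


existing = {
    "description": "Name given to a person, including all potential middle names, but excluding the family name.",
    "name": "Given name",
    "sameAs": ["https://schema.org/givenName"],
    "schemas": ["core/v3/actors/person.schema.tpl.json"],
}
generated = {
    "name": "Given name",
    "schemas": [
        "core/v3/actors/person.schema.tpl.json",
        "example/v1/hypothetical.schema.tpl.json",  # made-up path for illustration
    ],
}
merged = update_entry(existing, generated)
```

After the merge, ##merged["schemas"]## lists both schema files, while the manually maintained ##"description"## and ##"sameAs"## values are unchanged.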
104
105 (% style="text-align: justify;" %)
106 As a safeguard, outdated entries in those openMINDS vocabulary files (e.g., because the namespace of the schema type or property changed or the schema type or property was deleted) are not automatically deleted, but kept and marked as being deprecated. After evaluation, deprecated schema types or properties can be deleted manually from the openMINDS vocabulary.
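
The idea of keeping outdated entries and marking them instead of deleting them can be sketched as follows. How the pipeline actually marks an entry is not documented here, so the key ##"deprecated"## and the function ##mark_deprecated## are purely hypothetical.

```python
def mark_deprecated(vocab, keys_in_use):
    """Flag entries that no longer occur in any schema instead of deleting them.

    Hypothetical helper: `vocab` maps a type or property key to its entry,
    `keys_in_use` is the set of keys found in the current schemas.
    """
    updated = {}
    for key, entry in vocab.items():
        entry = dict(entry)  # copy, so the input dictionary is left untouched
        if key not in keys_in_use:
            entry["deprecated"] = True  # assumed marker key, not the real one
        updated[key] = entry
    return updated


vocab = {
    "givenName": {"name": "Given name"},
    "oldProperty": {"name": "Old property"},  # made-up outdated entry
}
result = mark_deprecated(vocab, keys_in_use={"givenName"})
```

Here ##result## still contains both entries; only ##"oldProperty"## carries the marker, so a maintainer can review it before deleting it manually.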
107
108 (% style="text-align: justify;" %)
109 With that, the openMINDS vocab always reflects the up-to-date status of the schema types and properties in use across all openMINDS metadata models, while providing the opportunity to centrally review and maintain their consistency and references.
110
111 === The openMINDS schema template syntax ===
112
113 (% style="text-align: justify;" %)
114 All openMINDS metadata models use a lightweight schema template syntax for defining the expected metadata. The correspondingly formatted schema files use the extension **##.schema.tpl.json##**.
115
116 (% style="text-align: justify;" %)
117 Although, as the file extension suggests, this openMINDS schema template syntax is inspired by JSON-Schema, it simplifies or even omits technical aspects, making the openMINDS schemas more human-readable, especially for untrained eyes. Behind the scenes, within the openMINDS integration pipeline (cf. below), this schema template syntax is then interpreted and flexibly translated to various formal metadata formats (e.g., JSON-Schema).
118
119 (% style="text-align: justify;" %)
120 Despite the simplification in comparison to JSON-Schema, the openMINDS schema templates are, at their core, also specially formatted JSON files using a particular syntax, meaning special key-value pairs that define the validation rules of a schema. The following provides a full documentation of the openMINDS schema template syntax and of how its key-value pairs need to be defined and interpreted.
121
122 (% style="text-align: justify;" %)
123 For less experienced programmers, let's first explain some general terms that are used later in the openMINDS schema template syntax specification. More experienced programmers can of course skip these explanations and jump directly to the Specifications further down.
124
125 **What are strings, integers, floats or booleans?** Generally speaking, strings, integers, floats and booleans are basic (primitive) **data types**. A **string** is defined as a sequence of characters between quotes (e.g., ##"Is this a string? YES!"## or ##'thisIsAlsoAString'##). For a string, openMINDS accepts Unicode characters. An **integer** is a whole number, positive or negative, without decimals (e.g., ##5##, ##-5## or ##1238921234##). A **float** represents a real number, written with a decimal point dividing the integer and fractional part (e.g., ##5.15##, ##-5.15## or ##1238921234.1345##). The maximum size and precision of integers and floats depend on the format or computational language. A **boolean** represents one of the two logical values ##false## and ##true##, often encoded as the binary digits ##0## and ##1##. How a boolean is written depends highly on the format or computational language.
126
127 **What is a list or array?** A **list** is a data structure that is a mutable ordered sequence of values (also called items). The values of a list are typically defined between square brackets (e.g., ##[value1, value2, value3]##). Note that the values within a list do not have to have the same data type. In contrast, an **array** is a data structure that is a mutable ordered sequence of values that all have the same data type. Which data types are accepted for values in a list or in an array depends highly on the format or computational language.
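
As a small illustration of the difference, here is how Python (chosen only as an example language) handles the two structures. Python's built-in ##list## accepts mixed data types, whereas ##array## from its standard library enforces one data type for all items.

```python
from array import array

# A list may mix data types and can be changed after creation.
mixed = ["five", 5, 5.15, True]
mixed.append(None)

# An array is declared with one item type ("i" = signed integer).
integers = array("i", [5, -5, 12])
integers.append(7)

# Adding a value of another data type is rejected.
try:
    integers.append(5.15)
except TypeError:
    rejected = True
else:
    rejected = False
```

After running this, ##mixed## holds five items of different types, ##integers## holds four integers, and ##rejected## is ##True## because the float could not be stored in the integer array.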
128
129 **What is a key-value pair or an associative array?** A **key-value pair** (sometimes also called name-value pair, attribute-value pair, property-value pair, or field-value pair) is a basic data representation and standard language feature in computing languages, systems and applications. In most cases this concept is used to build an **associative array** (also called **dictionary**), meaning an unordered collection of unique keys with associated values, typically defined within curly brackets (e.g., ##{key1: value1, key3: value3, key2: value2}##). Which data types are accepted for keys and values depends highly on the format or computational language. Note that a value can also be a data structure, such as a list, an array or an associative array.
130
131 **What is JSON?** JSON is short for **J**ava**S**cript **O**bject **N**otation, a lightweight data-interchange format which is built on associative arrays with key-value pairs and lists. A JSON document/file typically has an associative array at its top level. The keys are separated from the values by a colon, and key-value pairs are separated by commas. While a key always has to be a string in double quotes, a value can be a string in double quotes, an integer, a float, a boolean (written as ##true## or ##false##), ##null##, a list or an associative array. Nesting of these structures is unlimited. For more information please go to the official webpage: [[https:~~/~~/www.json.org/>>https://www.json.org/]]. Several serialisation formats have been built on the JSON specification, such as JSON-LD (cf. [[Application details: JSON-LD - the openMINDS serialization format>>doc:Collabs.openminds.Documentation.Application details.WebHome||target="_blank"]]). In addition, several schema languages have been developed to annotate and validate JSON documents, such as JSON-Schema and SHACL (cf. The openMINDS integration pipeline).
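
To connect these terms, the following sketch parses a small JSON document with Python's standard ##json## module and records which data type each value is read as. The document content is made up for illustration and is not a valid openMINDS instance.

```python
import json

# Made-up JSON document covering every kind of value described above.
document = """
{
  "givenName": "Ada",
  "age": 36,
  "height": 1.65,
  "active": true,
  "nickname": null,
  "keywords": ["mathematics", "computing"],
  "affiliation": {"name": "Example Institute"}
}
"""

data = json.loads(document)

# Map each key to the name of the Python type its value was parsed into.
kinds = {key: type(value).__name__ for key, value in data.items()}
```

In Python, the top-level associative array becomes a ##dict##, the list becomes a ##list##, ##true## becomes ##True##, and ##null## becomes ##None##; other languages use their own equivalents.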
132
133 ==== Specifications ====
134
135 (coming soon)
136
137 === The openMINDS integration pipeline ===
138
139 (//**coming soon**//) If you'd like to learn more about the openMINDS integration pipeline, especially if you'd like to contribute to it, please get in touch with us (the openMINDS development team) via the issues on the openMINDS or openMINDS_generator GitHub repositories or via the support email: openminds@ebrains.eu
140
141 {{putFootnotes/}}