Wiki source code of Storing data in user space

Version 13.2 by allan on 2020/03/10 13:51

This article describes a workflow for using the Collaboratory.drive as a backend for your service, so that it can store and read data inside a private user space.

== Solution description ==

Your Keycloak client can be set up to have a service account linked to it. Because Keycloak sees this service account as a user, it can log in to the Collaboratory.drive to have its user account synchronised there.

From this point, everything is set up to let your service account create files and folders and share them with existing users. This can be achieved by using the existing Seafile API (Seafile is the tool behind the Collaboratory.drive).

== Creating a service account ==

If needed, follow the [[guide to create an OpenID Connect client>>doc:Collabs.collaboratory-community-apps.Community App Developer Guide.WebHome||anchor="HCreatingyourOpenIDConnectclient"]].

You will need to [[modify your client>>doc:Collabs.collaboratory-community-apps.Community App Developer Guide.WebHome||anchor="HModifyingyourclient"]] to allow service accounts:

{{code language="bash"}}
# Set your registration token
clb_reg_token=...

# Update the client
curl -X PUT https://iam.ebrains.eu/auth/realms/hbp/clients-registrations/default/my-awesome-client \
  -H "Authorization: Bearer ${clb_reg_token}" \
  -H 'Content-Type: application/json' \
  -d '{
    "clientId": "my-awesome-client",
    "serviceAccountsEnabled": true
  }' |

# Prettify the JSON response
json_pp;
{{/code}}

== Creating a user account for the service account in the Collaboratory.drive ==

This step requires admin privileges. Please send a request to [[support@humanbrainproject.eu>>mailto:support@humanbrainproject.eu]] to get help.

The steps for the admins are the following:

1. get the service account sub
1. enable the service account user
1. impersonate the service account
1. log in to the Collaboratory.drive

=== Getting the service account sub ===

One way to get the service account sub is to request a token with its credentials.

{{code language="bash"}}
# Set the client id and secret
clb_client_id=...
clb_client_secret=...

# Call the token endpoint
curl -X POST https://iam.ebrains.eu/auth/realms/hbp/protocol/openid-connect/token \
  -d 'grant_type=client_credentials' \
  -d "client_id=${clb_client_id}" \
  -d "client_secret=${clb_client_secret}" |

# Prettify the JSON response
json_pp;
{{/code}}

Fetch the access token from the response and use a tool to decode its payload. [[https://jwt.io/]] is one option. Copy the ##sub## from the payload.

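The payload can also be decoded locally instead of pasting the token into a website. The following is a minimal sketch, assuming a ##base64## command that accepts ##-d## (as in GNU coreutils) and a standard ##sed##. The function names ##jwt_payload## and ##jwt_sub## are illustrative and not part of any existing tool. The sketch only decodes the payload and does not verify the token signature.

```bash
# Print the decoded JSON payload (second dot-separated segment) of a JWT.
jwt_payload() {
  local segment
  # The payload is base64url-encoded: map the URL-safe characters back
  segment=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # Restore the '=' padding that JWTs strip
  while [ $(( ${#segment} % 4 )) -ne 0 ]; do
    segment="${segment}="
  done
  printf '%s' "$segment" | base64 -d
}

# Print the value of the "sub" claim of a JWT.
# Assumption: the claim is a plain JSON string without escaped quotes.
jwt_sub() {
  jwt_payload "$1" |
    sed -n 's/.*"sub"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

With the access token stored in a variable, ##jwt_sub "${clb_access_token}"## would print the sub to copy.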
=== Enabling the service account user ===

Navigate to https://iam.ebrains.eu/auth/admin/master/console/#/realms/hbp/users/$sub (replacing $sub with the value you got in the previous step).

Set the "Email Verified" value to "On" and remove any "Required User Actions".

You can now impersonate the user and log in to the Collaboratory.drive, which will create the user account for the service account.

== Creating the user space ==

The general process is the following:

1. Fetch a token for your service account so that it can talk to the drive API.
1. Get the id of your service account's default library. This is where your service will create the users' spaces.
1. For a given user, create a folder named after a unique identifier (the sub, username or email).

From this point, your service can store and read data linked to user accounts.

If your service needs the data to be available in notebooks, it will need to share it with the user:

1. Inside the user folder, create a folder whose name is the same for all users.
1. Share this inner folder with the user.

This way, notebooks will be able to refer to your service data with a common path for all users.

=== Getting an API token ===

The first step is to fetch an access token for your service account. This can be done with the following request:

{{code language="bash"}}
# Set the client id and secret
clb_client_id=...
clb_client_secret=...

# Call the token endpoint
curl -X POST https://iam.ebrains.eu/auth/realms/hbp/protocol/openid-connect/token \
  -d 'grant_type=client_credentials' \
  -d "client_id=${clb_client_id}" \
  -d "client_secret=${clb_client_secret}" \
  -d "scope=openid email collab.drive" |

# Prettify the JSON response
json_pp;
{{/code}}

The response will be similar to:

{{code language="javascript"}}
{
  "expires_in" : 108000,
  "not-before-policy" : 1563261088,
  "access_token" : "eyJhbGciOiJSU...",
  "session_state" : "4882dbae-56dc-4a91-b8ae-8ad07117d4af",
  "refresh_expires_in" : 14400,
  "refresh_token" : "eyJhbGciOiJIUz...",
  "token_type" : "bearer",
  "scope" : "openid email collab.drive"
}
{{/code}}

Fetch the access token from this response.

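In a script, the access token can be pulled out of the response without copying it by hand. The following is a rough sketch under stated assumptions: ##json_string_field## and ##fetch_access_token## are hypothetical helper names, the token endpoint URL is passed in as an argument, and the ##sed## expression only handles simple string values without escaped quotes. A real JSON parser would be more robust.

```bash
# Print the value of a string field from a JSON document read on stdin.
# Assumption: the value contains no escaped double quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Request a token with the client credentials and print only the access token.
# $1 is the token endpoint URL; clb_client_id and clb_client_secret must be set.
fetch_access_token() {
  curl -sS -X POST "$1" \
    -d 'grant_type=client_credentials' \
    -d "client_id=${clb_client_id}" \
    -d "client_secret=${clb_client_secret}" \
    -d 'scope=openid email collab.drive' |
    json_string_field access_token
}
```

The same helper can be reused for other string fields returned by the APIs described in this article.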
The next step is to get an API token with the access token:

{{code language="bash"}}
# Set the access token value
clb_access_token=...

# Call the API token endpoint
curl -X GET https://drive.ebrains.eu/api2/account/token/ \
  -H "Authorization: Bearer ${clb_access_token}"
{{/code}}

The response will look like (% class="mark" %)##1c1345da8a99b36168afef92ef7f83af8b4ca6f0##(%%). This is the API token that you will need to use in your ##Authorization## header to talk to the Collaboratory.drive API.

{{warning}}
Note that, unlike access tokens, the API token needs an authorisation header in the form of "##Authorization: **Token** your-api-token##" (and not Bearer!).
{{/warning}}

=== Fetching your service account default library ===

Note: from this point, you can refer to the [[documentation of Seafile>>https://download.seafile.com/published/web-api/v2.1-admin/]] to make API calls to the Collaboratory.drive.

{{code language="bash"}}
# Set the API token
clb_api_token=...

# Call the default-repo endpoint
curl -X GET https://drive.ebrains.eu/api2/default-repo/ \
  -H "Authorization: Token ${clb_api_token}" |

# Prettify the JSON response
json_pp;
{{/code}}

You will get a response in the form of:

{{code language="javascript"}}
{
  "repo_id": "175a7b2e-f9f5-4f3c-a9e9-84c4e995f1ea",
  "exists": true
}
{{/code}}

The ##repo_id## is the identifier of the default library of your service account.

=== Creating a folder for a given user ===

In order to separate your users' data, you should create a folder for each user, based on a unique identifier of the user. The most secure identifier you can use is the ##sub##, as it is an internal unique identifier of IAM.

The ##username## and ##email## are also valid unique identifiers but, in some rare cases, they might be claimed by a different user in the future.

Creating a folder is done with the following call:

{{code language="bash"}}
# Set the API token, your library id and the folder path
clb_api_token=...
clb_repo_id=...
clb_folder_path=/my/user/folder/path

# Call the directory endpoint (the Seafile API reads the path from the p query parameter)
curl -X POST \
  "https://drive.ebrains.eu/api2/repos/${clb_repo_id}/dir/?p=${clb_folder_path}" \
  -H "Authorization: Token ${clb_api_token}" \
  -d 'operation=mkdir' |

# Prettify the JSON response
json_pp;
{{/code}}

Please note that you cannot create a path of folders all at once. You will need to make one call for each level of your path.

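The level-by-level creation can be scripted. The following is a minimal sketch, not a definitive implementation: ##path_levels## and ##create_path## are hypothetical helper names, ##clb_repo_id## and ##clb_api_token## are assumed to be set as in the call above, and passing the path as the ##p## query parameter follows the Seafile documentation. The sketch does not check whether a level already exists and does not URL-encode the path, so both would need to be handled before using it on real data.

```bash
# Print every cumulative level of a path, one per line, parent first.
path_levels() {
  local rest="${1#/}" current="" part
  while [ -n "$rest" ]; do
    part="${rest%%/*}"
    case "$rest" in
      */*) rest="${rest#*/}" ;;
      *) rest="" ;;
    esac
    # Skip empty components produced by doubled or trailing slashes
    [ -n "$part" ] || continue
    current="${current}/${part}"
    printf '%s\n' "$current"
  done
}

# Issue one mkdir call per level of the given path.
# Assumes clb_repo_id and clb_api_token are set; no existence check is done.
create_path() {
  path_levels "$1" | while IFS= read -r level; do
    curl -sS -X POST \
      "https://drive.ebrains.eu/api2/repos/${clb_repo_id}/dir/?p=${level}" \
      -H "Authorization: Token ${clb_api_token}" \
      -d 'operation=mkdir' < /dev/null
  done
}
```

For example, ##path_levels /my/user/folder/path## prints ##/my##, ##/my/user##, ##/my/user/folder## and ##/my/user/folder/path##, which is the order in which ##create_path## would issue its calls.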
=== Writing data in the user space ===

TODO

=== Fetching data from the user space ===

TODO

== Sharing data with notebooks ==

The Collaboratory.drive is mounted in the Jupyter container of the user. This means that notebooks can access data stored in the Collaboratory.drive of the user.

Let's imagine your service generates data that should be ingested by a notebook. In order for the notebook to be usable by any user, the path to the data must be the same for each user.

When you share a folder ##/path/to/the/shared_folder## with a user, the folder becomes reachable at the path ##drive/share with me/shared_folder##. As you can see, only the name of the folder is exposed to the user, not its parent path.

Imagine you want to generate data per user and you want notebooks to refer to this data in a folder named ##my-app-data##. You will need to create a folder in your service account default library in the following form: ##/path/to/user/spaces/in/your/library/**${user-id}**/my-app-data##.

Once the folder is created, you will need to share it with the user.

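The naming scheme above can be illustrated with two small helpers. This is only a sketch: ##service_folder_path## and ##notebook_folder_path## are hypothetical names, and the ##drive/share with me## prefix is taken from the description above. It shows that folders created for different users end up at the same path on the notebook side.

```bash
# Path of the per-user folder inside the service account library.
# $1 = base path of the user spaces, $2 = user identifier, $3 = common folder name
service_folder_path() {
  local base="${1%/}"
  printf '%s/%s/%s\n' "$base" "$2" "$3"
}

# Path at which the user sees the folder once it is shared:
# only the last component of the library path is kept.
notebook_folder_path() {
  printf 'drive/share with me/%s\n' "$(basename "$1")"
}
```

Calling ##notebook_folder_path## on the result of ##service_folder_path## for any user identifier yields ##drive/share with me/my-app-data## when the common folder name is ##my-app-data##.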
=== Sharing a folder with a user ===

{{code language="bash"}}
# Set the parameters
clb_api_token=...
clb_repo_id=...
clb_username=...
clb_folder_path="/path/to/user/spaces/in/your/library/${clb_username}/my-app-data"
clb_permission=rw # r = read-only, rw = read and write

# Call the shared_items endpoint (the Seafile API reads the path from the p query parameter)
curl -X PUT "https://drive.ebrains.eu/api2/repos/${clb_repo_id}/dir/shared_items/?p=${clb_folder_path}" \
  -H "Authorization: Token ${clb_api_token}" \
  -d 'share_type=user' \
  -d "permission=${clb_permission}" \
  -d "username=${clb_username}@humanbrainproject.eu"
{{/code}}

You will get a response in the form of:

{{code language="javascript"}}
// TODO
{{/code}}