Wiki source code of Storing data in user space

Version 14.1 by mmorgan on 2020/07/17 02:26

This page describes a workflow that lets your service use the Collaboratory.drive as a back-end to store and read data inside a private user space.

== Overview ==

Your OIDC client can be set up to have a service account linked to it. Because Keycloak sees this service account as a user, it can log in to the Collaboratory.drive, which synchronises its user account there.

From this point, everything is set up to let your service account create and share files and folders with existing users. This is done with the Seafile API (Seafile is the tool behind the Collaboratory.drive).

== Creating a service account in IAM ==

If you do not yet have an OIDC client for your service, see [[Registering an OIDC client>>doc:Collabs.collaboratory-community-apps.Community App Developer Guide.1\. Registering an OIDC client.WebHome]].

Once you have an OIDC client, you need to [[modify your client>>doc:Collabs.collaboratory-community-apps.Community App Developer Guide.WebHome||anchor="HModifyingyourclient"]] to allow service accounts:

{{code language="bash"}}
# Set your registration token and client id
clb_reg_token="..."
clb_client_id="my-awesome-client"

# Update the client. Note that the client ID appears both in the endpoint URL and in the body of the request.
# The JSON response is piped to json_pp for pretty printing.
curl -X PUT "https://iam.ebrains.eu/auth/realms/hbp/clients-registrations/default/${clb_client_id}" \
  -H "Authorization: Bearer ${clb_reg_token}" \
  -H 'Content-Type: application/json' \
  -d '{
    "clientId": "'"${clb_client_id}"'",
    "serviceAccountsEnabled": true
  }' | json_pp
{{/code}}

== Creating the service account in the Drive ==

Once you have created the service account in IAM, it also needs to be created in the Drive. This step requires IAM admin privileges, so you cannot do it yourself. Instead, send a request to [[support@ebrains.eu>>mailto:support@ebrains.eu]] and make sure to include the following information:

1. The URL of this page
1. Your client ID, e.g. "my-awesome-client"

Never share tokens or secrets.

The steps for the admins are the following:

1. Get the service account's sub.
1. Enable the service account user in IAM.
1. Create the service account in the Drive service.

=== Getting the service account's sub ===

One way to get the service account's sub is to request a token with its credentials. Get the client ID from the OIDC client owner, then get the client secret from IAM.

{{code language="bash"}}
# Set the client id and secret
clb_client_id=...
clb_client_secret=...

# Call the token endpoint and pretty print the JSON response
curl -X POST https://iam.ebrains.eu/auth/realms/hbp/protocol/openid-connect/token \
  -d 'grant_type=client_credentials' \
  -d "client_id=${clb_client_id}" \
  -d "client_secret=${clb_client_secret}" | json_pp
{{/code}}

Fetch the access token from the response and decode its payload. An easy way to decode the payload is to use the online service at [[https://jwt.io/]]. Copy the sub from the payload.

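If you would rather not paste a token into an online service, the payload can also be decoded locally. The following is a minimal sketch, assuming a Bash shell with the ##base64##, ##cut##, ##tr## and ##sed## utilities available. The function names ##jwt_payload## and ##jwt_sub## are hypothetical helpers introduced for illustration; they are not part of any EBRAINS tooling. The sketch only reads the payload and does not verify the token signature.

```shell
# Hypothetical helper: decode the payload (second dot-separated segment)
# of the JWT passed as $1 and print it as JSON text.
jwt_payload() {
  local segment
  # JWTs use base64url: map its alphabet back to standard base64
  segment=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # base64url omits padding; restore it so that base64 accepts the input
  while [ $(( ${#segment} % 4 )) -ne 0 ]; do
    segment="${segment}="
  done
  printf '%s' "$segment" | base64 -d
}

# Hypothetical helper: extract the "sub" claim from a payload read on stdin.
# This is a simple pattern match, not a full JSON parser.
jwt_sub() {
  sed -n 's/.*"sub"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

With the access token stored in a variable such as ##clb_access_token##, running ##jwt_payload "${clb_access_token}" | jwt_sub## prints the sub.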
=== Enabling the service account user ===

Navigate to https://iam.ebrains.eu/auth/admin/master/console/#/realms/hbp/users/$sub (replacing $sub with the value you got at the previous step).

Set the "Email Verified" value to "On" and remove any "Required User Actions".

=== Creating the service account in the Drive ===

You can now impersonate the service account and log in to the Collaboratory.drive, which will create the user account for the service account.

== Creating the user space ==

Once support confirms that the service account has been activated in the Drive, you can create the user space. It is created in the Drive, in the default library of the service account. Each user of your service will have their own user space.

The general process is the following:

1. Fetch a token for your service account so that it can communicate with the Drive API.
1. Get the ID of your service account's default library. This is where your service will create the users' spaces.
1. For a given user, create a folder named after a unique identifier (either the sub or the username).

From this point, your service can store and read data linked to users' accounts.

If your service needs the data to be available in Jupyter Notebooks, it will need to share it with the user:

1. Inside the user folder, create a folder with a name that is common across users.
1. Share this inner folder with the individual user.

This way, notebooks will be able to refer to your service data with a common path for all users.

=== Getting an API token ===

The first step is to fetch an access token for your service account. This can be done with the following request:

{{code language="bash"}}
# Set the client id and secret
clb_client_id=...
clb_client_secret=...

# Call the token endpoint and pretty print the JSON response
curl -X POST https://iam.ebrains.eu/auth/realms/hbp/protocol/openid-connect/token \
  -d 'grant_type=client_credentials' \
  -d "client_id=${clb_client_id}" \
  -d "client_secret=${clb_client_secret}" \
  -d "scope=openid email collab.drive" | json_pp
{{/code}}

The response will be similar to:

{{code language="javascript"}}
{
  "expires_in" : 108000,
  "not-before-policy" : 1563261088,
  "access_token" : "eyJhbGciOiJSU...",
  "session_state" : "4882dbae-56dc-4a91-b8ae-8ad07117d4af",
  "refresh_expires_in" : 14400,
  "refresh_token" : "eyJhbGciOiJIUz...",
  "token_type" : "bearer",
  "scope" : "openid email collab.drive"
}
{{/code}}

Fetch the access token from this response.

The next step is to get a Drive API token with the access token:

{{code language="bash"}}
# Set the access token value
clb_access_token=...

# Call the Drive token endpoint
curl -X GET https://drive.ebrains.eu/api2/account/token/ \
  -H "Authorization: Bearer ${clb_access_token}"
{{/code}}

The response will look like (% class="mark" %)##1c1345da8a99b36168afef92ef7f83af8b4ca6f0##(%%). This is the API token that you will need to use in your ##Authorization## header to communicate with the Collaboratory.drive API.

{{warning}}
Note that, unlike access tokens, the Drive API token needs an authorisation header in the form of "##Authorization: **Token** your-api-token##" (and not Bearer!).
{{/warning}}

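The two requests above can be chained in a script so that no token has to be copied by hand. The following is a minimal sketch, assuming a Bash shell with ##curl##, ##sed## and ##tr##. The function names are hypothetical and introduced for illustration only, and the ##sed## expression is a simple pattern match for flat responses like the one shown above, not a full JSON parser.

```shell
# Hypothetical helper: extract the string value of field $1 from JSON on stdin.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Request an access token for the service account (client id $1, secret $2)
# and print only the access_token value.
fetch_access_token() {
  curl -s -X POST https://iam.ebrains.eu/auth/realms/hbp/protocol/openid-connect/token \
    -d 'grant_type=client_credentials' \
    -d "client_id=$1" \
    -d "client_secret=$2" \
    -d 'scope=openid email collab.drive' |
    json_string_field access_token
}

# Exchange the access token $1 for a Drive API token.
# Surrounding quotes and whitespace, if any, are stripped from the response.
fetch_drive_api_token() {
  curl -s -X GET https://drive.ebrains.eu/api2/account/token/ \
    -H "Authorization: Bearer $1" | tr -d '"[:space:]'
}
```

A possible use is ##clb_access_token=$(fetch_access_token "${clb_client_id}" "${clb_client_secret}")## followed by ##clb_api_token=$(fetch_drive_api_token "${clb_access_token}")##. Keep both values out of logs and shell history.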
=== Fetching your service account's default library ===

Note: from this point, you can refer to the [[documentation of Seafile>>https://download.seafile.com/published/web-api/v2.1-admin/]] to make API calls to the Collaboratory.drive.

{{code language="bash"}}
# Set the API token
clb_api_token=...

# Call the default-repo endpoint and pretty print the JSON response
curl -X GET https://drive.ebrains.eu/api2/default-repo/ \
  -H "Authorization: Token ${clb_api_token}" | json_pp
{{/code}}

You will get a response in the form of:

{{code language="javascript"}}
{
  "repo_id": "175a7b2e-f9f5-4f3c-a9e9-84c4e995f1ea",
  "exists": true
}
{{/code}}

The ##repo_id## is the identifier of the default library of your service account.

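To reuse the library identifier in later calls, it can be captured in a variable. The following is a minimal sketch under the same assumptions as the request above (Bash, ##curl##, ##sed##). The function name ##fetch_default_repo_id## is hypothetical, and the pattern match is not a full JSON parser.

```shell
# Print the repo_id of the default library for Drive API token $1.
# Prints nothing and returns a non-zero status if no repo_id is found.
fetch_default_repo_id() {
  local repo_id
  repo_id=$(curl -s -X GET https://drive.ebrains.eu/api2/default-repo/ \
    -H "Authorization: Token $1" |
    sed -n 's/.*"repo_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
  [ -n "$repo_id" ] && printf '%s\n' "$repo_id"
}
```

For example, ##clb_repo_id=$(fetch_default_repo_id "${clb_api_token}")## stores the identifier for the calls that follow.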
=== Creating a folder for a given user ===

In order to isolate the data of individual users, you should create a folder for each user, based on a unique identifier of the user. The safest identifier you can use is the ##sub##, as it is an internal unique identifier of IAM.

The ##username## is also a valid unique identifier but, in some rare cases, it might be claimed by a different user in the future.

Creating a folder is done with the following call:

{{code language="bash"}}
# Set the API token, your library id and the folder path
clb_api_token=...
clb_repo_id=...
clb_folder_path=/my/user/folder/path

# Call the directory endpoint. The path is passed as the "p" query parameter,
# as in the Seafile documentation examples; URL-encode it if it contains special characters.
# The JSON response is piped to json_pp for pretty printing.
curl -X POST \
  "https://drive.ebrains.eu/api2/repos/${clb_repo_id}/dir/?p=${clb_folder_path}" \
  -H "Authorization: Token ${clb_api_token}" \
  -d 'operation=mkdir' | json_pp
{{/code}}

Please note that you cannot create a path of folders all at once. You will need to make one call for each level of your path.

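Because each level needs its own call, a nested path can be created by looping over its ancestors. The following is a minimal sketch, assuming a Bash shell, a normalised absolute path without empty segments, and the path passed as the ##p## query parameter as in the Seafile documentation examples. The function names ##path_levels## and ##create_folder_path## are hypothetical.

```shell
# Print every level of the absolute path $1, shallowest first, one per line.
# For example, /a/b/c yields /a, /a/b and /a/b/c.
path_levels() {
  local rest="${1#/}" current="" part
  while [ -n "$rest" ]; do
    part="${rest%%/*}"
    current="${current}/${part}"
    printf '%s\n' "$current"
    case "$rest" in
      */*) rest="${rest#*/}" ;;
      *)   rest="" ;;
    esac
  done
}

# Create each level of folder path $3 in library $2 using Drive API token $1.
# Assumption: path segments contain no characters that need URL-encoding.
create_folder_path() {
  local level
  path_levels "$3" | while IFS= read -r level; do
    curl -s -X POST "https://drive.ebrains.eu/api2/repos/$2/dir/?p=${level}" \
      -H "Authorization: Token $1" \
      -d 'operation=mkdir'
  done
}
```

The sketch does not check whether a level already exists. How the Drive responds when asked to create an existing folder is not covered on this page, so you may want to list the parent folder first.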
=== Writing data to the user space ===

TODO

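Pending a full write-up of this step, one possible approach is sketched here. It assumes that the Collaboratory.drive exposes the standard Seafile ##upload-link## endpoint and upload form fields (##file##, ##parent_dir##, ##replace##); this has not been confirmed for the Drive on this page, so treat it as a starting point to verify against the Seafile documentation. The function name ##upload_to_user_space## is hypothetical.

```shell
# Upload local file $4 into folder $3 of library $2 using Drive API token $1.
# Assumption: the Drive behaves like a standard Seafile server for uploads.
upload_to_user_space() {
  local upload_link
  # Ask for an upload link; Seafile returns it as a JSON-quoted URL
  upload_link=$(curl -s -X GET "https://drive.ebrains.eu/api2/repos/$2/upload-link/?p=$3" \
    -H "Authorization: Token $1" | tr -d '"')
  # Stop if the response does not look like a URL (for example an error message)
  case "$upload_link" in
    http*) ;;
    *) return 1 ;;
  esac
  # Send the file as a multipart form; replace=1 asks to overwrite an existing file
  curl -s -X POST "$upload_link" \
    -H "Authorization: Token $1" \
    -F "file=@$4" \
    -F "parent_dir=$3" \
    -F 'replace=1'
}
```

A call could look like ##upload_to_user_space "${clb_api_token}" "${clb_repo_id}" "/${user_sub}" ./result.json##, where ##user_sub## and ##result.json## are placeholder names.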
=== Fetching data from the user space ===

TODO

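Pending a full write-up of this step, one possible approach is sketched here. It assumes that the Collaboratory.drive exposes the standard Seafile ##file## endpoint, which returns a download link for a given path; this has not been confirmed for the Drive on this page, so verify it against the Seafile documentation. The function name ##download_from_user_space## is hypothetical.

```shell
# Download file $3 from library $2 to local path $4 using Drive API token $1.
# Assumption: the Drive behaves like a standard Seafile server for downloads.
download_from_user_space() {
  local download_link
  # Ask for a download link; Seafile returns it as a JSON-quoted URL
  download_link=$(curl -s -X GET "https://drive.ebrains.eu/api2/repos/$2/file/?p=$3" \
    -H "Authorization: Token $1" | tr -d '"')
  # Stop if the response does not look like a URL (for example an error message)
  case "$download_link" in
    http*) ;;
    *) return 1 ;;
  esac
  # Fetch the file content and write it to the requested local path
  curl -s -X GET "$download_link" -o "$4"
}
```

A call could look like ##download_from_user_space "${clb_api_token}" "${clb_repo_id}" "/${user_sub}/result.json" ./result.json##, where ##user_sub## and ##result.json## are placeholder names.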
== Sharing data with notebooks ==

The Collaboratory.drive is mounted in the Jupyter container of the user. This means that notebooks can access data stored in the Collaboratory.drive of the user.

Let's imagine your service generates data that should be ingested by a notebook. For the notebook to be usable by any user, the path to the data must be the same for each user.

When you share a folder ##/path/to/the/shared_folder## with a user, the folder becomes reachable at the path ##drive/share with me/shared_folder##. As you can see, only the name of the folder is visible to the user, not its parent path.

Imagine you want to generate data per user and you want notebooks to refer to this data in a folder named ##//my-awesome-client//-data##. You will need to create a folder in your service account's default library in the following form:

##/path/to/user/spaces/in/your/library/**${user-id}**///my-awesome-client//-data##.

Make sure the folder name starts with your OIDC client name or some other unique name. Once the folder is created, you will need to share it with the user.

=== Sharing a folder with a user ===

{{code language="bash"}}
# Set the parameters
clb_api_token=...
clb_repo_id=...
clb_username=...
clb_folder_path="/path/to/user/spaces/in/your/library/${clb_username}/my-awesome-client-data"
clb_permission=rw # r = read-only, rw = read and write

# Call the shared_items endpoint. The path is passed as the "p" query parameter,
# as in the Seafile documentation examples; URL-encode it if it contains special characters.
curl -X PUT "https://drive.ebrains.eu/api2/repos/${clb_repo_id}/dir/shared_items/?p=${clb_folder_path}" \
  -H "Authorization: Token ${clb_api_token}" \
  -d 'share_type=user' \
  -d "permission=${clb_permission}" \
  -d "username=${clb_username}@humanbrainproject.eu"
{{/code}}

You will get a response in the form of:

{{code language="javascript"}}
// TODO
{{/code}}
249 You will get a response in the form of:
250
251 {{code language="javascript"}}
252 // TODO
253 {{/code}}