Wiki source code of Storing data in user space
Last modified by messines on 2021/09/29 14:32
author | version | line-number | content |
---|---|---|---|
1 | This page describes a workflow that you can follow to use the Collaboratory.drive as a back-end for your service to store and read data inside a private user space. | ||
2 | |||
3 | == Overview == | ||
4 | |||
5 | Your OIDC client can be set up with a service account linked to it. Because Keycloak treats this service account as a user, it can log in to the Collaboratory.drive so that its user account is synchronised there. | ||
6 | |||
7 | From this point, everything is set up to let your service account create and share files and folders with existing users. This can be achieved by using the existing Seafile API (the tool behind the Collaboratory.drive). | ||
8 | |||
9 | == Creating a service account in IAM == | ||
10 | |||
11 | If you do not yet have an OIDC client for your service, see [[Registering an OIDC client>>doc:Collabs.collaboratory-community-apps.Community App Developer Guide.1\. Registering an OIDC client.WebHome]]. | ||
12 | |||
13 | Once you have an OIDC client, you need to [[modify your client>>doc:Collabs.collaboratory-community-apps.Community App Developer Guide.WebHome||anchor="HModifyingyourclient"]] to allow service accounts: | ||
14 | |||
15 | {{code language="bash"}} | ||
16 | # Set your registration token and client id | ||
17 | clb_reg_token="..." | ||
18 | clb_client_id="my-awesome-client" | ||
19 | |||
20 | # Update the client. Note that the client ID appears both in the endpoint URL and in the body of the request. | ||
21 | curl -X PUT https://iam.ebrains.eu/auth/realms/hbp/clients-registrations/default/${clb_client_id} \ | ||
22 | -H "Authorization: Bearer ${clb_reg_token}" \ | ||
23 | -H 'Content-Type: application/json' \ | ||
24 | -d '{ | ||
25 | "clientId": "'${clb_client_id}'", | ||
26 | "serviceAccountsEnabled": true | ||
27 | }' | | ||
28 | |||
29 | # Pretty print the JSON response | ||
30 | json_pp | ||
31 | |||
32 | {{/code}} | ||
33 | |||
34 | == Creating the service account in the Drive == | ||
35 | |||
36 | Once you have created the service account in IAM, the service account needs to be created in the Drive. This step requires IAM admin privileges so you cannot do it yourself. Send a request to [[support@ebrains.eu>>mailto:support@ebrains.eu]] instead and make sure to include the following information: | ||
37 | |||
38 | 1. The URL of this page: [[Create an IAM service account>>https://wiki.ebrains.eu/bin/view/Collabs/collab-devs/How%20To/Create%20an%20IAM%20service%20account/]] | ||
39 | 1. Your client ID: e.g. "my-awesome-client" | ||
40 | |||
41 | Never share tokens or secrets. | ||
42 | |||
43 | == Creating the user space == | ||
44 | |||
45 | Once support confirms that the service account has been activated in the Drive, you can create the user space. It lives in the Drive, in the default library of the service account. Each user of your service will have their own user space. | ||
46 | |||
47 | The general process is the following: | ||
48 | |||
49 | 1. Fetch a token for your service account so that it can communicate with the Drive API. | ||
50 | 1. Get the id of your service account's default library. This is where your service will create the users' spaces. | ||
51 | 1. For a given user, create a folder named after a unique identifier of that user (either the sub or the username). | ||
52 | |||
53 | From this point, your service can now store and read data linked to users' accounts. | ||
54 | |||
55 | If your service needs the data to be available in Jupyter Notebooks, it will need to share it with the user: | ||
56 | |||
57 | 1. Inside the user folder, your service should create a folder whose name is the same for all users. | ||
58 | 1. Your service should then share this inner folder with the individual user. | ||
59 | |||
60 | This way, notebooks will be able to refer to your service data with a common path for all users. | ||
61 | |||
62 | === Getting an API token === | ||
63 | |||
64 | The first step is to fetch an access token for your service account. This can be done with the following request: | ||
65 | |||
66 | {{code language="bash"}} | ||
67 | # Set the client id and secret | ||
68 | clb_client_id=... | ||
69 | clb_client_secret=... | ||
70 | |||
71 | # Call the token endpoint | ||
72 | curl -X POST https://iam.ebrains.eu/auth/realms/hbp/protocol/openid-connect/token \ | ||
73 | -d 'grant_type=client_credentials' \ | ||
74 | -d "client_id=${clb_client_id}" \ | ||
75 | -d "client_secret=${clb_client_secret}" \ | ||
76 | -d "scope=openid email collab.drive" | | ||
77 | |||
78 | # Pretty print the JSON response | ||
79 | json_pp | ||
80 | {{/code}} | ||
81 | |||
82 | The response will be similar to: | ||
83 | |||
84 | {{code language="javascript"}} | ||
85 | { | ||
86 | "expires_in" : 108000, | ||
87 | "not-before-policy" : 1563261088, | ||
88 | "access_token" : "eyJhbGciOiJSU...", | ||
89 | "session_state" : "4882dbae-56dc-4a91-b8ae-8ad07117d4af", | ||
90 | "refresh_expires_in" : 14400, | ||
91 | "refresh_token" : "eyJhbGciOiJIUz...", | ||
92 | "token_type" : "bearer", | ||
93 | "scope" : "openid email collab.drive" | ||
94 | } | ||
95 | {{/code}} | ||
96 | |||
97 | Extract the ##access_token## value from this response. | ||
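In a script, the token can be pulled out of the JSON response instead of being copied by hand. The following is a minimal sketch, assuming ##python3## is installed (##jq## would work just as well); the helper name ##extract_json_field## is hypothetical and the sample response is shortened in the same way as the example above.

```shell
# Hypothetical helper: print the value of a top-level key of a JSON document read on stdin.
# Assumes python3 is available on the machine running the script.
extract_json_field() {
  python3 -c 'import json, sys; print(json.load(sys.stdin)[sys.argv[1]])' "$1"
}

# Canned response standing in for the output of the token endpoint
sample_response='{"access_token": "eyJhbGciOiJSU...", "token_type": "bearer"}'

# Keep only the access token for the next call
clb_access_token=$(printf '%s' "${sample_response}" | extract_json_field access_token)
echo "${clb_access_token}"
```

In a real run, the output of the ##curl## call to the token endpoint would be piped into the helper in place of the canned response.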
98 | |||
99 | The next step is to get a Drive API token with the access token: | ||
100 | |||
101 | {{code language="bash"}} | ||
102 | # Set the access token value | ||
103 | clb_access_token=... | ||
104 | |||
105 | # Call the token endpoint | ||
106 | curl -X GET https://drive.ebrains.eu/api2/account/token/ \ | ||
107 | -H "Authorization: Bearer ${clb_access_token}" | ||
108 | {{/code}} | ||
109 | |||
110 | The response will look like (% class="mark" %)##1c1345da8a99b36168afef92ef7f83af8b4ca6f0##(%%). This is the API token that you will need to use in your ##Authorization## header to communicate with the Collaboratory.drive API. | ||
111 | |||
112 | {{warning}} | ||
113 | Note that, unlike access tokens, the Drive API token needs an authorisation header in the form of "##Authorization: **Token** your-api-token##" (and not ##Bearer##!). | ||
114 | {{/warning}} | ||
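One way to avoid mixing up the two header forms is to build them in a single place. This is only a sketch: the helper names ##iam_auth_header##, ##drive_auth_header## and ##drive_get## are hypothetical and not part of any official client.

```shell
# Hypothetical helpers that keep the two header forms apart.
# IAM access tokens use the Bearer scheme ...
iam_auth_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# ... while the Drive API token uses the Token scheme.
drive_auth_header() {
  printf 'Authorization: Token %s' "$1"
}

# Hypothetical wrapper for GET calls to the Drive API.
#   $1: Drive API token
#   $2: path below /api2/, e.g. "default-repo/"
drive_get() {
  curl -sS -X GET "https://drive.ebrains.eu/api2/$2" -H "$(drive_auth_header "$1")"
}

# Usage (needs a valid API token, so it is left commented out):
#   drive_get "${clb_api_token}" "default-repo/"
```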
115 | |||
116 | === Fetching your service account's default library === | ||
117 | |||
118 | Note: from this point, you can refer to the [[documentation of Seafile>>https://download.seafile.com/published/web-api/v2.1-admin/]] to make API calls to the Collaboratory.drive. | ||
119 | |||
120 | {{code language="bash"}} | ||
121 | # Set the API token | ||
122 | clb_api_token=... | ||
123 | |||
124 | # Call the default-repo endpoint | ||
125 | curl -X GET https://drive.ebrains.eu/api2/default-repo/ \ | ||
126 | -H "Authorization: Token ${clb_api_token}" | | ||
127 | |||
128 | # Pretty print the JSON response | ||
129 | json_pp | ||
130 | {{/code}} | ||
131 | |||
132 | You will get a response in the form of: | ||
133 | |||
134 | {{code language="javascript"}} | ||
135 | { | ||
136 | "repo_id": "175a7b2e-f9f5-4f3c-a9e9-84c4e995f1ea", | ||
137 | "exists": true | ||
138 | } | ||
139 | {{/code}} | ||
140 | |||
141 | The ##repo_id## is the identifier of the default library of your service account. | ||
142 | |||
143 | === Creating a folder for a given user === | ||
144 | |||
145 | To isolate the data of individual users, you should create a folder for each user, named after a unique identifier of the user. The safest identifier you can use is the ##sub##, as it is an internal unique identifier in IAM. | ||
146 | |||
147 | The ##username## is also a valid unique identifier but, in rare cases, a username might be claimed by a different user in the future. | ||
148 | |||
149 | Creating a folder is done with the following call: | ||
150 | |||
151 | {{code language="bash"}} | ||
152 | # Set the API token, your library id and the folder path | ||
153 | clb_api_token=... | ||
154 | clb_repo_id=... | ||
155 | clb_folder_path=/my/user/folder/path | ||
156 | |||
157 | # Call the dir endpoint. The folder path goes in the "p" query parameter
158 | # (URL-encode it if it contains spaces or special characters).
159 | curl -X POST \ | ||
160 | "https://drive.ebrains.eu/api2/repos/${clb_repo_id}/dir/?p=${clb_folder_path}" \ | ||
161 | -H "Authorization: Token ${clb_api_token}" \ | ||
162 | -F 'operation=mkdir' | | ||
163 | |||
164 | # Pretty print the JSON response | ||
165 | json_pp | ||
166 | {{/code}} | ||
167 | |||
168 | Please note that you cannot create a nested path of folders in a single call. You need to make one call for each level of the path. | ||
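The level-by-level creation can be scripted. The sketch below uses two hypothetical helpers: ##path_levels## lists each level of a path from the shallowest to the deepest, and ##create_folder_path## issues one ##mkdir## call per level. It assumes the folder path is passed as the ##p## query parameter and contains no characters that need URL-encoding. It does not check whether a level already exists, so a real service should verify that first.

```shell
# Hypothetical helper: print every level of an absolute path, shallowest first.
path_levels() {
  local rest="${1#/}" current="" part
  while [ -n "${rest}" ]; do
    part="${rest%%/*}"            # first remaining component
    if [ "${rest}" = "${part}" ]; then
      rest=""                     # that was the last component
    else
      rest="${rest#*/}"           # drop the component and its slash
    fi
    [ -n "${part}" ] || continue  # skip empty components caused by doubled slashes
    current="${current}/${part}"
    printf '%s\n' "${current}"
  done
}

# Hypothetical helper: create each level with its own call.
#   $1: Drive API token, $2: library id, $3: folder path
create_folder_path() {
  local level
  path_levels "$3" | while IFS= read -r level; do
    curl -sS -X POST "https://drive.ebrains.eu/api2/repos/$2/dir/?p=${level}" \
      -H "Authorization: Token $1" \
      -F 'operation=mkdir'
  done
}

# Listing the levels does not touch the network
path_levels /my/user/folder/path
```

The last line prints ##/my##, ##/my/user##, ##/my/user/folder## and ##/my/user/folder/path##, one per line, which is the order in which the folders have to be created.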
169 | |||
170 | === Writing data to the user space === | ||
171 | |||
172 | TODO | ||
173 | |||
174 | === Fetching data from the user space === | ||
175 | |||
176 | TODO | ||
177 | |||
178 | == Sharing data with notebooks == | ||
179 | |||
180 | The Collaboratory.drive is mounted in the Jupyter container of the user. This means that notebooks can access data stored in the Collaboratory.drive of the user. | ||
181 | |||
182 | Let's imagine your service generates data that should be ingested by a notebook. In order for the notebook to be usable by any user, the path to the data must be the same for each user. | ||
183 | |||
184 | When sharing a folder ##/path/to/the/shared_folder## with a user, the folder becomes reachable at the path ##drive/share with me/shared_folder##. As you can see, the user only sees the name of the folder itself, not the path leading to it. | ||
185 | |||
186 | Imagine you want to generate data per user and you want notebooks to refer to this data in a folder named ##//my-awesome-client//-data##. You will need to create a folder in your service account's default library in the following form: | ||
187 | |||
188 | ##/path/to/user/spaces/in/your/library/**${user-id}**///my-awesome-client//-data##. | ||
189 | |||
190 | Make sure the folder name starts with your OIDC client name or some other unique name, so that it does not clash with folders shared by other services. Once the folder is created, you need to share it with the user. | ||
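As an illustration, the per-user path can be assembled from its three parts. This is a sketch only: the helper name ##user_shared_folder_path## and the value ##some-user-id## are hypothetical placeholders.

```shell
# Hypothetical helper: build the path of the folder to share with one user.
#   $1: base path of the user spaces in the service account's library
#   $2: unique identifier of the user (sub or username)
#   $3: common folder name, prefixed with the OIDC client name
user_shared_folder_path() {
  # "${1%/}" drops a trailing slash from the base path, if any
  printf '%s/%s/%s' "${1%/}" "$2" "$3"
}

clb_folder_path=$(user_shared_folder_path \
  "/path/to/user/spaces/in/your/library" \
  "some-user-id" \
  "my-awesome-client-data")
echo "${clb_folder_path}"
```

Only the last component, ##my-awesome-client-data##, is the same for every user, so notebooks can refer to it without knowing the user identifier.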
191 | |||
192 | === Sharing a folder with a user === | ||
193 | |||
194 | {{code language="bash"}} | ||
195 | # Set the parameters | ||
196 | clb_api_token=... | ||
197 | clb_repo_id=... | ||
198 | clb_username=... | ||
199 | clb_folder_path="/path/to/user/spaces/in/your/library/${clb_username}/my-awesome-client-data" | ||
200 | clb_permission=rw # r = read-only, rw = read and write | ||
201 | |||
202 | # Call the shared_items endpoint. The folder path goes in the "p" query parameter
203 | # (URL-encode it if it contains spaces or special characters).
204 | curl -X PUT "https://drive.ebrains.eu/api2/repos/${clb_repo_id}/dir/shared_items/?p=${clb_folder_path}" \ | ||
205 | -H "Authorization: Token ${clb_api_token}" \ | ||
206 | -F 'share_type=user' \ | ||
207 | -F "permission=${clb_permission}" \ | ||
208 | -F "username=${clb_username}@humanbrainproject.eu" | ||
209 | {{/code}} |