This article describes a workflow for using the Collaboratory.drive as a backend for your service, so that it can store and read data inside a private user space.
Solution description
Your Keycloak client can be set up to have a service account linked to it. Since Keycloak treats this service account as a user, it can log in to the Collaboratory.drive to have its user account synchronised there.
From this point, everything is set up to let your service account create files and folders and share them with existing users. This can be achieved by using the existing Seafile API (Seafile is the tool behind the Collaboratory.drive).
Creating a service account
If needed, follow the guide to create an OpenID Connect client.
You will need to modify your client to allow service accounts:
clb_reg_token=...
# Update the client
curl -X PUT https://iam.humanbrainproject.eu/auth/realms/hbp/clients-registrations/default/my-awesome-client \
-H "Authorization: Bearer ${clb_reg_token}" \
-H 'Content-Type: application/json' \
-d '{
"clientId": "my-awesome-client",
"serviceAccountsEnabled": true
}' |
# Prettify the JSON response
json_pp;
Creating a user account for the service account in the Collaboratory.drive
This step requires admin privileges. Please send a request to support@humanbrainproject.eu in order to get help.
The steps for the admins are the following:
- get the service account sub
- enable the service account user
- impersonate the service account
- log in to the Collaboratory.drive
Getting the service account sub
One way to get the service account sub is to request a token with its credentials.
clb_client_id=...
clb_client_secret=...
# Call the token endpoint
curl -X POST https://iam-dev.humanbrainproject.eu/auth/realms/hbp/protocol/openid-connect/token \
-d 'grant_type=client_credentials' \
-d "client_id=${clb_client_id}" \
-d "client_secret=${clb_client_secret}" |
# Prettify the JSON response
json_pp;
Fetch the access token from the response and use a tool to decode its payload. https://jwt.io/ is one option. Copy the sub from the payload.
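As an alternative to an online decoder, the payload can also be decoded locally. The following is a minimal sketch, assuming a Bash-like shell and the GNU base64 tool; the jwt_payload helper is a hypothetical name introduced here for illustration and is not part of the Collaboratory tooling.

```shell
# Hypothetical helper: print the decoded payload of a JWT.
# Assumes GNU coreutils `base64` (the -d flag decodes).
jwt_payload() {
  local payload
  # Keep the second dot-separated segment and map base64url to base64
  payload=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # Restore the '=' padding that JWT encoding strips
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}

# Usage (clb_access_token holds the token from the response above):
# jwt_payload "${clb_access_token}" | json_pp
```

The sub can then be read from the prettified output. Note that this only decodes the token; it does not verify its signature.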
Enabling the service account user
Navigate to https://iam.humanbrainproject.eu/auth/admin/master/console/#/realms/hbp/users/$sub (replacing $sub with the value you got at the previous step).
Set the "Email Verified" value to "On" and remove any "Required User Actions".
You can now impersonate the user and log in to the Collaboratory.drive, which will create the user account for the service account.
Creating the user space
The general process is the following:
- Fetch a token for your service account to be able to communicate with the drive API.
- Your service needs to get its default library id. This library is where it will create the users' spaces.
- For a given user, your service should create a folder, using a unique identifier (either the sub, username or email).
From this point, your service can store and read data linked to user accounts.
If your service needs the data to be available in notebooks, it will need to share it with the user:
- Inside the user folder, your service should create a folder with a name that is the same for all users.
- Your service should then share this inner folder with the user.
This way, notebooks will be able to refer to your service data with a common path for every user.
Getting an API token
The first step is to fetch an access token for your service account. This can be done with the following request:
clb_client_id=...
clb_client_secret=...
# Call the token endpoint
curl -X POST https://iam-dev.humanbrainproject.eu/auth/realms/hbp/protocol/openid-connect/token \
-d 'grant_type=client_credentials' \
-d "client_id=${clb_client_id}" \
-d "client_secret=${clb_client_secret}" |
# Prettify the JSON response
json_pp;
The response will be similar to:
{
"expires_in" : 108000,
"not-before-policy" : 1563261088,
"access_token" : "eyJhbGciOiJSU...",
"session_state" : "4882dbae-56dc-4a91-b8ae-8ad07117d4af",
"refresh_expires_in" : 14400,
"refresh_token" : "eyJhbGciOiJIUz...",
"token_type" : "bearer",
"scope" : "openid roles email"
}
Fetch the access token from this response.
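If you want to script this step, the token can be extracted from the response with standard tools. The following is a minimal sketch using sed; json_string_field is a hypothetical helper introduced here for illustration, and the response body in the example is shortened.

```shell
# Hypothetical helper: print the value of a string field read from
# JSON on standard input. Only suitable for simple values without
# escaped quotes, such as tokens.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example with a shortened response body
response='{"access_token" : "eyJhbGciOiJSU...", "token_type" : "bearer"}'
clb_access_token=$(printf '%s\n' "$response" | json_string_field access_token)
printf '%s\n' "$clb_access_token"
```

In a real script, the output of the curl call would be piped into json_string_field instead of the sample string. A proper JSON parser is more robust if one is available on your system.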
The next step is to get an API token with the access token:
clb_access_token=...
# Call the token endpoint
curl -X GET https://drive.humanbrainproject.eu/api2/account/token/ \
-H "Authorization: Bearer ${clb_access_token}"
The response will look like 1c1345da8a99b36168afef92df7f83af8b4ca6f0. This is the API token that you will need to use in your Authorization header to communicate with the Collaboratory.drive API.
Fetching your service account default library
Note: from this point, you can refer to the documentation of Seafile to make API calls to the Collaboratory.drive.
clb_api_token=...
# Call the default-repo endpoint
curl -X GET https://drive.humanbrainproject.eu/api2/default-repo/ \
-H "Authorization: Token ${clb_api_token}" |
# Prettify the JSON response
json_pp;
You will get a response in the form of:
{
"repo_id": "175a7b2e-f9f5-4f3c-a9e9-84c4e995f1ea",
"exists": true
}
The `repo_id` is the identifier of the default library of your service account.
Creating a folder for a given user
In order to separate your users' data, you should create a folder for each user, based on a unique identifier of the user. The most secure identifier you can use is the `sub`, as it is an internal unique identifier of IAM.
The `username` and `email` are also valid unique identifiers but, in some rare cases, they might be claimed by a different user in the future.
Creating a folder is done with the following call:
clb_api_token=...
clb_repo_id=...
clb_folder_path=/my/user/folder/path
# Call the dir endpoint (the folder path goes in the `p` query parameter)
curl -X POST \
"https://drive.humanbrainproject.eu/api2/repos/${clb_repo_id}/dir/?p=${clb_folder_path}" \
-H "Authorization: Token ${clb_api_token}" \
-d 'operation=mkdir' |
# Prettify the JSON response
json_pp;
Please note that you cannot create a path of folders all at once. You will need to make one call for each level of your path.
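To issue one call per level, the path first has to be split into its intermediate levels. The following is a minimal sketch of that splitting step; path_levels is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: print every level of an absolute path,
# shallowest first, one per line.
path_levels() {
  local remaining current segment
  remaining="${1#/}"   # drop the leading slash
  current=""
  while [ -n "$remaining" ]; do
    segment="${remaining%%/*}"   # first remaining path component
    if [ -n "$segment" ]; then   # skip empty components such as "//"
      current="${current}/${segment}"
      printf '%s\n' "$current"
    fi
    case "$remaining" in
      */*) remaining="${remaining#*/}" ;;   # more components left
      *)   remaining="" ;;
    esac
  done
}

path_levels /my/user/folder/path

# Usage sketch: run the folder creation call once per level
# path_levels "${clb_folder_path}" | while read -r level; do
#   ...  # folder creation request with ${level} as the path
# done
```

This sketch does not check whether a level already exists; how the API responds to creating an existing folder should be verified before relying on it.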
Writing data in the user space
TODO
Fetching data from the user space
TODO
Sharing data with notebooks
The Collaboratory.drive is mounted in the Jupyter container of the user. This means that notebooks can access data stored in the Collaboratory.drive of the user.
Let's imagine your service generates data that should be ingested by a notebook. In order for the notebook to be usable by any user, the path to the data must be the same for each user.
When sharing a folder /path/to/the/shared_folder with a user, the folder becomes reachable at the path drive/share with me/shared_folder. As you can see, only the folder name, not its parent path, is visible to the user.
Imagine you want to generate data per user and you want notebooks to refer to this data in a folder named my-app-data. You will need to create a folder in your service account default library in the following form: /path/to/user/spaces/in/your/library/${user-id}/my-app-data.
Once the folder is created, you will need to share it with the user.
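The path mapping described above can be illustrated with a short sketch. The user identifier jdoe is a made-up example value, and the notebook-side path simply follows the mapping stated above (only the last path component is kept).

```shell
# Hypothetical user identifier, for illustration only
clb_user_id="jdoe"

# Folder created in the service account default library
clb_folder_path="/path/to/user/spaces/in/your/library/${clb_user_id}/my-app-data"

# Path under which the notebook of that user sees the shared folder
notebook_path="drive/share with me/$(basename "${clb_folder_path}")"
printf '%s\n' "$notebook_path"
```

Because the user-specific part of the path is dropped, the notebook-side path is identical for every user.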
Sharing a folder with a user
clb_api_token=...
clb_repo_id=...
clb_username=...
clb_folder_path="/path/to/user/spaces/in/your/library/${clb_username}/my-app-data"
clb_permission=rw # r=read-only, rw=read-write
# Call the shared_items endpoint (the folder path goes in the `p` query parameter)
curl -X PUT "https://drive.humanbrainproject.eu/api2/repos/${clb_repo_id}/dir/shared_items/?p=${clb_folder_path}" \
-H "Authorization: Token ${clb_api_token}" \
-d 'share_type=user' \
-d "permission=${clb_permission}" \
-d "username=${clb_username}@humanbrainproject.eu"
You will get a response in the form of: