JupyterHub Deployment Proposal

Last modified by adietz on 2019/07/18 16:38

Draft

This document addresses an initial release of JupyterLab within the Collaboratory 2.0. To deliver something quickly, this release will be limited in scope. Long-term planning can be discussed here.

This page describes the proposed JupyterHub setup that will be offered as part of the "scientific release" of the Collaboratory 2.0 planned for Q3 2019.

The initial release should allow users to run notebooks in the Collaboratory 2.0 environment. The following features should be available from the start:

integration with the drive, providing access to the notebooks and data stored there that the user is allowed to access;
integration with the user's IAM account and OAuth tokens;
example notebooks to showcase some possibilities;
notebook versioning (limited);
preinstalled scientific and neuroscience libraries in a notebook container image (libraries TBD).

The goal is to allow users to create new notebooks on the Collaboratory 2.0 platform and to migrate some, but not all, notebooks from the old Collaboratory to the new platform. Migrating notebooks might require some changes, especially with regard to storage, but also with regard to the libraries used to access certain services and to how the services themselves are accessed.

The new JupyterHub deployment will not be fully compatible with the old environment. The storage model will change significantly. Certain libraries will no longer be available. Python 2 support needs to be evaluated (see the note at the bottom).

Architecture

Based on Kubernetes
Seafile storage mounted in user containers
Possibility of mounting large documents read-only (RO) in certain images
Possibility of selecting from different images
Role-based access to resources (quotas based on users' access level)
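
The image selection and role-based quotas listed above could be expressed as a list of spawner profiles. The following is a minimal sketch, not a decided design: the access-level names, quota values and image names are hypothetical placeholders, since this proposal does not specify them. The dictionary shape (`display_name`, `kubespawner_override`) follows the entries KubeSpawner accepts in its `profile_list` option.

```python
# Hypothetical access levels and quotas; the proposal does not fix any values.
QUOTAS = {
    "basic": {"cpu_limit": 1, "mem_limit": "2G"},
    "extended": {"cpu_limit": 4, "mem_limit": "8G"},
}

# Hypothetical image names, used only for illustration.
IMAGES = {
    "Default notebook": "registry.example.org/collab/notebook:latest",
    "Notebook with large datasets (RO)": "registry.example.org/collab/notebook-data:latest",
}


def profiles_for(access_level):
    """Return one profile per selectable image, capped by the user's quota.

    Unknown access levels fall back to the "basic" quota.
    """
    quota = QUOTAS.get(access_level, QUOTAS["basic"])
    return [
        {
            "display_name": name,
            # Settings applied to the spawner when the user picks this profile.
            "kubespawner_override": {"image": image, **quota},
        }
        for name, image in IMAGES.items()
    ]
```

In a real deployment the access level would presumably be derived from the user's IAM account, and the resulting list would be handed to the spawner configuration. How that lookup works is outside the scope of this sketch.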

Components

JupyterHub: 1.x
- Spawner: KubeSpawner
- Authenticator: OAuthenticator
Notebook image:
- Main notebook image: IPython: 7.x
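
The component choices above map onto a handful of JupyterHub configuration settings. The sketch below shows roughly what that could look like. It makes several assumptions: the choice of `GenericOAuthenticator` (the proposal only says "OAuthenticator"), the image name, and the helper name `apply_hub_settings` are all illustrative, and the exact option names should be checked against the JupyterHub, KubeSpawner and OAuthenticator versions actually deployed.

```python
def apply_hub_settings(c):
    """Apply the proposed component choices to a JupyterHub config object `c`."""
    # Spawn each user's notebook server as a Kubernetes pod.
    c.JupyterHub.spawner_class = "kubespawner.KubeSpawner"

    # Authenticate against the IAM service via OAuth.
    # Assumption: the generic OAuth2 authenticator is used.
    c.JupyterHub.authenticator_class = "oauthenticator.generic.GenericOAuthenticator"

    # Keep the OAuth tokens in the hub's auth state so they can be passed on
    # to the notebook environment.
    c.Authenticator.enable_auth_state = True

    # Hypothetical name for the main notebook image.
    c.KubeSpawner.image = "registry.example.org/collab/notebook:latest"
```

Inside a `jupyterhub_config.py` file, the config object is the `c` that JupyterHub provides, so the function would be called there as `apply_hub_settings(c)`.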

Features

TBD: Requirements management (how to manage Python libraries and other dependencies).
TBD: Versioning of notebooks. Initially, this will only be available through the versioning provided by Seafile.
TBD: Workflows