How to Create a GPU-Powered Containerized Multi-User JupyterHub Research Server?

Tolga Üstünkök

In this post, we will build a GPU-powered deep learning development server. We will use Docker to containerize the JupyterLab environment. The server also features data and notebook directories that are shared among the users.

Introduction

High-performance development servers are important for research groups, so several companies offer hosted solutions. Google has its Colab environment, and there is also the Kaggle community. These environments supply processing power, including GPUs and TPUs, to their users.

The main challenge with such environments is isolating each user's environment from the others, which is usually done with containers. At this point you may have heard the names Docker and Kubernetes; most of those solutions use Docker under the hood. If you have spent time on such a platform, chances are high that you have seen the current Docker version displayed somewhere in the sidebar.

Isolating environments from each other is one part of the problem; managing access to those isolated environments is another. On Kaggle, access management is embedded in the site itself; on Colab, it is handled by Google Accounts. So, if you want to set up a service like those, you have to handle access management yourself.

Prerequisites

Concepts about Linux

  1. Know how to use a remote Linux machine over SSH.
  2. Understand how network namespaces work under Linux.
  3. Understand the mount concept and how mounting works under Linux.

Concepts about Docker

  1. Some working experience with Docker.
  2. Know how to create new images.
  3. Know how to customize existing images.
  4. Some knowledge about Docker Compose.

Concepts about Other Stuff

  1. Knowledge about Git will make your life easier.

JupyterHub

JupyterHub is an open-source, multi-user version of the Jupyter notebook designed for companies, classrooms, and research labs. It manages user access to notebook servers through several authentication mechanisms. Essentially, JupyterHub is a web server that acts as a proxy in front of isolated Jupyter Notebook/JupyterLab instances belonging to different users. The following image, taken from the official JupyterHub documentation, illustrates how it works.

[Figure: JupyterHub architecture diagram, from the official JupyterHub documentation]

Authentication

JupyterHub provides some basic user management and administration features. For example, you can whitelist or blacklist users, grant some of them admin privileges, and so on.
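A minimal sketch of these options in jupyterhub_config.py (the usernames are placeholders; note that newer JupyterHub releases rename whitelist to allowed_users):

# Only the listed users may log in; 'user1' additionally gets admin rights.
# The usernames are placeholders for illustration.
c.Authenticator.whitelist = {'user1', 'user2', 'user3'}
c.Authenticator.admin_users = {'user1'}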

Some of the available authentication methods are listed below:

  • Local Authenticator
  • OAuthenticator
  • Dummy Authenticator

There are more of them; you can check the documentation for the full list.

The Local Authenticator manages users on the local system. If you add a new user to JupyterHub, a matching system user is created with the adduser command. If you do not want to touch the system's users, you should not use this authenticator.
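A minimal sketch, assuming you do want JupyterHub to create matching system accounts automatically:

# PAM authentication against local system accounts; missing users are
# created on first login with the system's adduser command.
c.JupyterHub.authenticator_class = 'jupyterhub.auth.PAMAuthenticator'
c.LocalAuthenticator.create_system_users = True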

OAuthenticator delegates the login to an external OAuth provider, so users grant JupyterHub access without handing it their passwords. It is safer and more secure than managing passwords yourself. If you are planning to open JupyterHub to the public, this may be the right choice.

The Dummy Authenticator is the simplest type of authenticator. It accepts any username-password pair. You can also set a global password, and every user who knows the correct global password can successfully log in to JupyterHub.
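A minimal sketch of a global-password setup, assuming a JupyterHub version that ships DummyAuthenticator (the password is a placeholder and this is only suitable for testing):

from jupyterhub.auth import DummyAuthenticator

# Any username is accepted as long as the global password matches.
# For testing only; the password below is a placeholder.
c.JupyterHub.authenticator_class = DummyAuthenticator
c.DummyAuthenticator.password = 'a-global-test-password'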

Notebook Spawners

JupyterHub's main purpose is to spawn single-user notebook servers for authenticated users. A single-user server is just an instance of Jupyter Notebook. Jupyter Notebook is available for many operating systems; you can install it via package managers (e.g. conda, pacman) or via pip. However, if you do not want to fill your machine with random junk, you can run the notebooks from a Docker image instead. You can find the official Docker stacks on GitHub.
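If you just want to see what such an image provides, you can try one of the official stack images on its own; a quick standalone test, not part of the final setup:

$ # Pull and run an official stack image; Jupyter prints a tokenized URL
$ # on port 8888. This is only a quick test, not the JupyterHub setup.
$ docker run --rm -p 8888:8888 jupyter/scipy-notebook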

For spawning notebook servers, JupyterHub provides the following methods:

  • Native Spawner
  • Docker Spawner

Again, there are many more; you can check the documentation for the full list.

The Native Spawner spawns notebooks from the locally installed Jupyter Notebook. These notebooks use the libraries and frameworks of the main system, so to add new libraries or frameworks you have to install them into the main system.

The Docker Spawner spawns notebooks from a prepared Docker image. If you need to add libraries or frameworks to the notebooks, you have to add them to the respective Dockerfile and rebuild the Docker image. This spawner type is the main focus of this post.

Customizing the JupyterHub Docker Image

JupyterHub has an official Docker image. This image contains only the JupyterHub service itself; the notebook, the authentication mechanisms, and any configuration are not included. You have to derive a new image from this official one, and the correct way of doing that is, of course, to prepare a new Dockerfile.

Writing the Dockerfile for JupyterHub

The first thing to do in the Dockerfile is to set the base image. In this case, the base image is the official JupyterHub image. Add the following command as the first line of the Dockerfile of your custom JupyterHub image.

FROM jupyterhub/jupyterhub

Depending on your choice of spawner and authenticator, you have to install some packages. The following snippet installs dockerspawner and oauthenticator so that notebooks can be spawned from a notebook image and GitHub accounts can be used for authentication.

RUN pip install dockerspawner oauthenticator

The final step is to copy the configuration file for JupyterHub into the image with the following command.

COPY jupyterhub_config.py .

This command assumes that the jupyterhub_config.py file is in the same directory as the Dockerfile. The complete Dockerfile should look like the following snippet.

FROM jupyterhub/jupyterhub
RUN pip install dockerspawner oauthenticator
COPY jupyterhub_config.py .

When you build the image defined in the Dockerfile with the following command, you will have a ready-to-run JupyterHub image for the next phases.

$ docker build --rm -t jupyterhub .

As a refresher, the final dot (.) means the current directory. Here, the tag of the prepared image is jupyterhub. If you run the following command, you can see the image.

$ docker image ls

Output looks like this:

REPOSITORY              TAG           IMAGE ID        CREATED             SIZE
jupyterhub              latest        cb9bfc0fc293    14 hours ago        935MB
salda-special           latest        4847a31b61b7    6 days ago          9.85GB
tensorflow-notebook     latest        7fe345b290f6    2 weeks ago         8.68GB
scipy-notebook          latest        5f5f71c0d6b9    2 weeks ago         6.89GB
minimal-notebook        latest        f9df51d25171    2 weeks ago         5.91GB
base-notebook           latest        328581e70c7b    2 weeks ago         4.25GB
pclink                  0.1.0-alpha   44ca70efcd30    2 months ago        660MB
nextcloud               latest        7ec24835eb2d    2 months ago        724MB
nginx                   1             6678c7c2e56c    2 months ago        127MB
mariadb                 10.4          1fd0e719c495    2 months ago        356MB
mariadb                 10.4.12       1fd0e719c495    2 months ago        356MB
python                  3-alpine      537dfdf79ddc    2 months ago        107MB
ubuntu                  18.04         72300a873c2c    2 months ago        64.2MB
ubuntu                  latest        a2a15febcdf3    9 months ago        64.2MB
jupyterhub/jupyterhub   latest        64d82994fd55    12 months ago       932MB
openproject/community   8             99757bbbc2a4    14 months ago       1.59GB

Configuring the JupyterHub

The previously mentioned jupyterhub_config.py file contains the information necessary to configure JupyterHub. You can see an example configuration in the following snippet.

import os

c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = os.environ["DOCKER_JUPYTER_IMAGE"]
c.DockerSpawner.network_name = os.environ["DOCKER_NETWORK_NAME"]
c.JupyterHub.hub_ip = os.environ["HUB_IP"]
c.Authenticator.admin_users = {'user1', 'user2'}

from oauthenticator.github import GitHubOAuthenticator
c.JupyterHub.authenticator_class = GitHubOAuthenticator

c.GitHubOAuthenticator.oauth_callback_url = \
                    'http://<host_ip_addr>/hub/oauth_callback'
c.GitHubOAuthenticator.client_id = '<client_id>'
c.GitHubOAuthenticator.client_secret = '<client_secret>'

notebook_dir = os.environ.get('DOCKER_NOTEBOOK_DIR') or '/home/jovyan/work'
c.DockerSpawner.notebook_dir = notebook_dir

# Mount the real user's Docker volume on the host to the notebook user's
# notebook directory in the container
c.DockerSpawner.volumes = {
          'jupyterhub-user-{username}': notebook_dir,
          'jupyterhub-shared': '/home/jovyan/work/shared',
          'jupyterhub-data': '/home/jovyan/work/data'
}

c.DockerSpawner.remove_containers = True
c.Spawner.default_url = '/lab'

Spawner and Network Configurations

As you can see from the code snippet, the spawner_class attribute is set to the dockerspawner.DockerSpawner class. This means that when a user logs in, their notebook is spawned in a Docker container. However, you also need to provide the image, network, and volume properties.

  • By using the image property of the DockerSpawner, you can set the image that will be used when creating the containers.
  • By using network_name, you can set the network name. This is important because, when the container is started, you have to tell Docker which network the newly created container should join.
  • By using the notebook_dir property, you can set the notebook working directory inside the container, which also serves as the mount point for the user volumes below.

If you want persistent storage for your users, you have to assign a volume to each of them. The volumes property is a dict() object: its keys are the volume names and its values are the mount points inside the container. In this example, one volume is created per user. Besides, if you want to share data among users, you can also define volumes with fixed names.

Then, you have to specify hub_ip, the address at which the spawned notebook containers can reach the JupyterHub container. Inside a Docker network, container names are resolved to IP addresses by Docker's internal DNS, so the service name can be used here. The environment variables referenced above will be set in the docker-compose.yml file in a later section.

Authentication Configurations

You can use any of the previously mentioned authentication methods. In this example, OAuth with GitHub accounts is used.

To do this, import the GitHubOAuthenticator class and set it as the authenticator_class of JupyterHub. After that, you can use the properties of GitHubOAuthenticator: make the necessary configuration on the GitHub side and then set the oauth_callback_url, client_id, and client_secret variables.

The GitHub side of the OAuth configuration is quite simple. You define a new OAuth application under your account; each OAuth application has a unique client ID and secret. Paste the ID and secret into the places indicated in the configuration file.
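If you prefer not to hard-code the secrets in jupyterhub_config.py, a possible variation is to read them from environment variables instead; the variable names below are my own choice, not part of JupyterHub:

import os

# Read the GitHub OAuth credentials from environment variables
# (hypothetical variable names) instead of hard-coding them.
c.GitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.GitHubOAuthenticator.client_id = os.environ['GITHUB_CLIENT_ID']
c.GitHubOAuthenticator.client_secret = os.environ['GITHUB_CLIENT_SECRET']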

Miscellaneous Configuration Options

The last two lines of the configuration file are optional.

The first one indicates that when a user stops her/his notebook server, the corresponding container is deleted.

The second one makes JupyterLab, rather than the classic Jupyter Notebook, the default interface. Jupyter Notebook is fine with its extensions, but its interface is not as usable as JupyterLab's.

Customizing the Docker Stack

Luckily, the Jupyter project provides official Docker stacks for various purposes. If you follow this link, you can see many *-notebook images in the repository. These notebooks are organized hierarchically; the order can be expressed as a tree structure.

  • base-notebook
    • minimal-notebook
      • r-notebook
      • scipy-notebook
        • datascience-notebook
        • pyspark-notebook
          • all-spark-notebook
        • tensorflow-notebook

All notebooks are derived, directly or indirectly, from base-notebook. This image instructs the underlying system to install all the packages necessary to run a Jupyter Notebook or JupyterLab.

If you need any additional packages or libraries, you can clone this repository and add them to any Dockerfile you want. However, if you change any node other than a leaf in the hierarchy, you have to rebuild all dependent images for your changes to take effect.

For example, I generally build a brand-new image on top of tensorflow-notebook and add whatever libraries or packages I want to install, as sketched below. This way, the original Dockerfiles stay untouched and the risk of failure is minimized.
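A minimal sketch of such a derived image; the extra packages are only examples of what a group might need, not part of the original stacks:

# Derived image on top of the locally built tensorflow-notebook.
# The extra packages below are illustrative examples only.
FROM tensorflow-notebook

RUN pip install --quiet scikit-image opencv-python-headless && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER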

Running Tensorflow with GPU Support

Running TensorFlow with GPU support is a little tricky. However, if you follow the TensorFlow documentation strictly, the chances that you will run into problems are low. Regardless of whether you run TensorFlow in Docker or not, you must have the NVIDIA drivers installed on the host machine.

To run the whole official Docker stack with NVIDIA CUDA 10.1 and cuDNN 7.x (the requirements of TensorFlow v2.1.0), you have to change the base image of base-notebook. The first lines of its Dockerfile should look like this.

ARG ROOT_CONTAINER=nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
ARG BASE_CONTAINER=$ROOT_CONTAINER
FROM $BASE_CONTAINER

With those lines, base-notebook now relies on NVIDIA's official Docker image. Since base-notebook has changed, you have to rebuild all the other images built on top of it, which means rebuilding all other *-notebook images from scratch, as sketched below. This process is time-consuming but straightforward.
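A possible rebuild sequence, assuming you are inside the cloned docker-stacks repository and want local image tags like the ones in the listing above; only the branches of the tree you actually use need to be rebuilt:

$ docker build --rm -t base-notebook ./base-notebook
$ docker build --rm -t minimal-notebook --build-arg BASE_CONTAINER=base-notebook ./minimal-notebook
$ docker build --rm -t scipy-notebook --build-arg BASE_CONTAINER=minimal-notebook ./scipy-notebook
$ docker build --rm -t tensorflow-notebook --build-arg BASE_CONTAINER=scipy-notebook ./tensorflow-notebook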

From this point on, all of your Docker stack images are capable of running code on the GPU. To give Docker itself the ability to use GPUs, you have to install the NVIDIA Container Toolkit. To do so, run the following commands in a terminal.

$ # Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker

After installing the nvidia-container-toolkit and restarting the Docker daemon, you can run the nvidia-smi command to check whether you can access the GPU.

$ #### Test nvidia-smi with the latest official CUDA image
$ docker run --rm --gpus all nvidia/cuda:10.1-base nvidia-smi

You should see a result similar to this:

Sat May 16 20:38:33 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    On   | 00000000:01:00.0 Off |                  N/A |
|  0%   34C    P8     8W / 200W |   7997MiB /  8116MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      9174      C   /opt/conda/bin/python                       7985MiB |
+-----------------------------------------------------------------------------+

It means that you have successfully installed the NVIDIA Container Toolkit and can reach the onboard GPU.

The only remaining step is to enable GPU support for TensorFlow. To do this, edit the Dockerfile of tensorflow-notebook. The lines containing the following code:

# Install Tensorflow
RUN pip install --quiet \
    'tensorflow==2.1.0' && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

Should be changed to:

# Install Tensorflow
RUN pip install --quiet \
    'tensorflow-gpu==2.1.0' && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

That's all. Containers created from this image are now capable of using the onboard GPU.

Getting Everything Together - Docker Compose

Up to this point, we have customized and built multiple images. These images need an orchestrator to work together. Although there are various tools you could choose for orchestration, this post explains how to use Docker Compose for the task.

Docker Compose is a thin layer on top of Docker, which makes it very lightweight. Docker Compose takes a YAML file that contains the definitions of the services, their properties, and the relationships between them.

The default name for the YAML file is docker-compose.yml. The directory structure including the docker-compose.yml is as follows.

.
├── docker-compose.yml
├── docker-stacks
│   ├── all-spark-notebook
│   ├── base-notebook
│   ├── binder
│   ├── CODE_OF_CONDUCT.md
│   ├── conftest.py
│   ├── CONTRIBUTING.md
│   ├── datascience-notebook
│   ├── docs
│   ├── examples
│   ├── LICENSE.md
│   ├── Makefile
│   ├── minimal-notebook
│   ├── pyspark-notebook
│   ├── pytest.ini
│   ├── README.md
│   ├── requirements-dev.txt
│   ├── r-notebook
│   ├── scipy-notebook
│   ├── tensorflow-notebook
│   └── test
└── jupyterhub
    ├── Dockerfile
    └── jupyterhub_config.py

21 directories, 11 files

As you can see from the directory structure, docker-compose.yml sits next to the jupyterhub directory and the cloned docker-stacks repository. The information necessary to prepare both of those directories was given in the previous sections.

The following is a sample YAML file for a complete containerized JupyterHub configuration.

version: '2.3'

services:
  jupyterhub:
    build: ./jupyterhub
    image: jupyterhub
    ports:
      - "80:8000"
    container_name: jupyterhub-container
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - jupyterhub_data:/srv/jupyterhub
    environment:
      DOCKER_JUPYTER_CONTAINER: jupyter-notebook
      DOCKER_JUPYTER_IMAGE: tensorflow-notebook
      DOCKER_NETWORK_NAME: ${COMPOSE_PROJECT_NAME}_default
      HUB_IP: jupyterhub

volumes:
  jupyterhub_data:

All docker-compose.yml files need a version key. This line tells Docker Compose which version of the Compose file format to use. Then, within the services key, a single service is defined: jupyterhub.

Docker Compose Configuration for JupyterHub Service

The build subkey indicates the directory from which the image is built. Since the docker-compose.yml file lies next to the jupyterhub directory, the value should be ./jupyterhub.

The value of the image subkey is the name given to the image produced when you build the service.

The ports subkey is a list of host:container port mappings; here, port 80 on the host is forwarded to port 8000 inside the container.

container_name is the name given to the container when an instance is created.

volumes lists the volumes to be mounted. The left-hand side of each entry states the source, which can be either a named volume or a directory on the host machine. The first entry is a special one: it mounts the host's Docker socket, and without it JupyterHub cannot spawn Jupyter Notebook containers. The second volume keeps the database and configuration information persistent across restarts and rebuilds.

environment defines the environment variables that will be set inside the JupyterHub container. Notice that these are exactly the variables used in the JupyterHub configuration file.
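With the directory structure and docker-compose.yml in place, the whole stack can be brought up with the usual Docker Compose commands. A minimal sketch, assuming COMPOSE_PROJECT_NAME is set in a .env file next to docker-compose.yml (the project name below is just an example) so that it matches the DOCKER_NETWORK_NAME used in the configuration:

$ # Example .env next to docker-compose.yml; the project name is arbitrary
$ echo "COMPOSE_PROJECT_NAME=jupyterhub_deploy" > .env
$ # Build the JupyterHub image and start the service in the background
$ docker-compose up -d --build
$ # Follow the logs to verify that the Hub started correctly
$ docker-compose logs -f jupyterhub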

Conclusion

Owning a research server that multiple users can work on at the same time is an important asset for research teams. In this post, we explained how to set up a server with these features. Posts like this are relatively important because there is not much information out there about putting all of these pieces together. I hope you enjoyed it and found it informative. Feel free to provide any feedback, related to the post or not. See you…