Configure an Ubuntu GPU node

Overview

Configuring an Ubuntu node with GPUs is a little more involved than a typical worker node. This guide will walk you through all the steps necessary to configure an Ubuntu node with attached GPUs so that they can be provisioned and used by Modzy in the Kubernetes cluster.

Update kernel features

In order to use the GPU for running models rather than for driving a display, we need to tell the kernel not to load the open-source nouveau display driver module. To do that, we'll follow the instructions provided by NVIDIA.

First, create the file /etc/modprobe.d/blacklist-nouveau.conf with the following content:

blacklist nouveau
options nouveau modeset=0

Next, rebuild the initramfs so the blacklist is applied at boot:

sudo update-initramfs -u

Reboot!

For these changes to take effect, we need to reboot the machine.
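As an optional sanity check, after the machine comes back up you can confirm that the nouveau module was not loaded:

```shell
# Reboot for the blacklist to take effect.
sudo reboot

# After the node is back up: no output from this command means
# the nouveau module is no longer loaded.
lsmod | grep nouveau
```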

Install Drivers

The first step is to ensure that the proper NVIDIA driver(s) are installed for the attached hardware. Drivers are available from NVIDIA's driver download page. For data center GPUs, set the Product Type to Data Center / Tesla, select the product series and product matching the GPU you have attached, and set the operating system to Linux 64-bit.

After downloading the driver as a *.run file, copy it to your GPU node and then run the following commands:

sudo apt-get update
sudo apt-get install build-essential
# Replace this .run file name with the file that you downloaded.
sudo sh NVIDIA-Linux-x86_64-510.73.08.run
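Once the installer finishes, you can verify that the driver can talk to the hardware by querying the GPU:

```shell
# Should print the driver version and a table listing each attached GPU.
# If this errors, the driver install (or the nouveau blacklist) needs revisiting.
nvidia-smi
```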

Install NVIDIA Container Runtime

In order to use your NVIDIA GPU inside a container, a different container runtime needs to be installed. Docker/containerd will use this runtime to run containers on this server and expose the GPU to them.

We can install this from NVIDIA directly by adding their apt repository to our sources:

curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
sudo apt-get install nvidia-container-runtime
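A quick way to confirm the install succeeded is to check that the runtime binary is now on the PATH:

```shell
# Prints a path such as /usr/bin/nvidia-container-runtime if the
# package installed correctly; no output indicates a failed install.
which nvidia-container-runtime
```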

Initialize K3s agent

Now that our prerequisites are installed, it's time to initialize, but not start, the k3s agent for this node. We'll join it to our Kubernetes cluster the same way we would a normal inference node, with two differences: 1) we'll tell the installer not to start the k3s agent immediately, and 2) we'll add an additional taint to the node so that only pods requiring a GPU will be scheduled on it.

curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_VERSION=v1.21.12+k3s1 sh -s - agent --server https://<IP or DNS of primary node>:6443 --node-label "modzy.com/node-type=inference" --node-taint "modzy.com/inference-node=true:NoSchedule" --node-taint "nvidia.com/gpu=true:NoSchedule" --token <INSERT YOUR TOKEN HERE>
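If you don't have the join token handy, it can be read from the primary k3s server node:

```shell
# Run on the primary (server) node; the printed value is what goes
# in the --token argument of the install command above.
sudo cat /var/lib/rancher/k3s/server/node-token
```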

Using self-signed certificates

If you're using a self-signed certificate for your Modzy installation, there is one additional step you need to perform when creating your Inference nodes. By default, Docker (containerd, specifically, in this case) requires valid TLS certificates before it will pull images from a registry. With a self-signed certificate, Modzy's internal registry will fail these checks, so we need to tell the daemon to skip them for Modzy's registry.

If we assume that you're installing Modzy under the domain modzy.example.com, then the registry will be exposed as registry.modzy.example.com. So we need to disable TLS validation for this domain.

On each of your Inference nodes, open an SSH session and create a file /etc/rancher/k3s/registries.yaml with the following content:

configs:
  "registry.modzy.example.com":
    tls:
      insecure_skip_verify: true

Replace registry.modzy.example.com with the domain that you're going to use.

Changing container runtimes

Now that k3s is installed, we need to modify its configuration to tell it to use the NVIDIA container runtime we installed rather than its default runtime.

To do that, we need to add a config file template to k3s' config directory that overrides the container runtime. First, let's make sure the directory exists:

sudo mkdir -p /var/lib/rancher/k3s/agent/etc/containerd

Next, save the following template into that directory as config.toml.tmpl:

[plugins.opt]
  path = "{{ .NodeConfig.Containerd.Opt }}"
[plugins.cri]
  stream_server_address = "127.0.0.1"
  stream_server_port = "10010"
  enable_selinux = {{ .NodeConfig.SELinux }}
{{- if .DisableCgroup}}
  disable_cgroup = true
{{end}}
{{- if .IsRunningInUserNS }}
  disable_apparmor = true
  restrict_oom_score_adj = true
{{end}}
{{- if .NodeConfig.AgentConfig.PauseImage }}
  sandbox_image = "{{ .NodeConfig.AgentConfig.PauseImage }}"
{{end}}
{{- if .NodeConfig.AgentConfig.Snapshotter }}
[plugins.cri.containerd]
  snapshotter = "{{ .NodeConfig.AgentConfig.Snapshotter }}"
  disable_snapshot_annotations = {{ if eq .NodeConfig.AgentConfig.Snapshotter "stargz" }}false{{else}}true{{end}}
{{ if eq .NodeConfig.AgentConfig.Snapshotter "stargz" }}
{{ if .NodeConfig.AgentConfig.ImageServiceSocket }}
[plugins.stargz]
cri_keychain_image_service_path = "{{ .NodeConfig.AgentConfig.ImageServiceSocket }}"
[plugins.stargz.cri_keychain]
enable_keychain = true
{{end}}
{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins.stargz.registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins.stargz.registry.mirrors."{{$k}}"]
  endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{if $v.Rewrites}}
  [plugins.stargz.registry.mirrors."{{$k}}".rewrite]
{{range $pattern, $replace := $v.Rewrites}}
    "{{$pattern}}" = "{{$replace}}"
{{end}}
{{end}}
{{end}}
{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins.stargz.registry.configs."{{$k}}".auth]
  {{ if $v.Auth.Username }}username = {{ printf "%q" $v.Auth.Username }}{{end}}
  {{ if $v.Auth.Password }}password = {{ printf "%q" $v.Auth.Password }}{{end}}
  {{ if $v.Auth.Auth }}auth = {{ printf "%q" $v.Auth.Auth }}{{end}}
  {{ if $v.Auth.IdentityToken }}identitytoken = {{ printf "%q" $v.Auth.IdentityToken }}{{end}}
{{end}}
{{ if $v.TLS }}
[plugins.stargz.registry.configs."{{$k}}".tls]
  {{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
  {{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
  {{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
  {{ if $v.TLS.InsecureSkipVerify }}insecure_skip_verify = true{{end}}
{{end}}
{{end}}
{{end}}
{{end}}
{{end}}
{{- if not .NodeConfig.NoFlannel }}
[plugins.cri.cni]
  bin_dir = "{{ .NodeConfig.AgentConfig.CNIBinDir }}"
  conf_dir = "{{ .NodeConfig.AgentConfig.CNIConfDir }}"
{{end}}
[plugins.cri.containerd.runtimes.runc]
  # ---- changed from 'io.containerd.runc.v2' for GPU support
  runtime_type = "io.containerd.runtime.v1.linux"
# ---- added for GPU support
[plugins.linux]
  runtime = "nvidia-container-runtime"
{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors."{{$k}}"]
  endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{if $v.Rewrites}}
  [plugins.cri.registry.mirrors."{{$k}}".rewrite]
{{range $pattern, $replace := $v.Rewrites}}
    "{{$pattern}}" = "{{$replace}}"
{{end}}
{{end}}
{{end}}
{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins.cri.registry.configs."{{$k}}".auth]
  {{ if $v.Auth.Username }}username = {{ printf "%q" $v.Auth.Username }}{{end}}
  {{ if $v.Auth.Password }}password = {{ printf "%q" $v.Auth.Password }}{{end}}
  {{ if $v.Auth.Auth }}auth = {{ printf "%q" $v.Auth.Auth }}{{end}}
  {{ if $v.Auth.IdentityToken }}identitytoken = {{ printf "%q" $v.Auth.IdentityToken }}{{end}}
{{end}}
{{ if $v.TLS }}
[plugins.cri.registry.configs."{{$k}}".tls]
  {{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
  {{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
  {{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
  {{ if $v.TLS.InsecureSkipVerify }}insecure_skip_verify = true{{end}}
{{end}}
{{end}}
{{end}}

📘

NOTE

This file is a slightly modified copy of the default config template: the runtime_type under [plugins.cri.containerd.runtimes.runc] and the added [plugins.linux] block have been changed to use the nvidia-container-runtime.

Now that the updated configuration is in place, we can start k3s:

sudo systemctl start k3s-agent
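Once the agent is up, you can check that the node joined the cluster and that k3s rendered our template into its containerd configuration:

```shell
# On the primary node: the new GPU node should appear and reach Ready state.
kubectl get nodes

# On the GPU node: k3s renders config.toml.tmpl into this file at startup;
# the output should include the nvidia-container-runtime entry.
sudo grep -A1 "plugins.linux" /var/lib/rancher/k3s/agent/etc/containerd/config.toml
```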

Install NVIDIA device plugin

The final step is to install the NVIDIA Kubernetes device plugin, which adds the GPU to applicable nodes as a schedulable resource. The plugin can be installed by running kubectl apply -f nvidia-device-plugin.yaml using the following file:

# Copyright (c) 2019, NVIDIA CORPORATION.  All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      # This annotation is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
      # This toleration is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      - key: CriticalAddonsOnly
        operator: Exists
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      # Allow this pod to run on Modzy inference nodes
      - key: modzy.com/inference-node
        operator: Exists
        effect: NoSchedule
      # Mark this pod as a critical add-on; when enabled, the critical add-on
      # scheduler reserves resources for critical add-on pods so that they can
      # be rescheduled after a failure.
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      priorityClassName: "system-node-critical"
      containers:
      - image: nvcr.io/nvidia/k8s-device-plugin:v0.11.0
        name: nvidia-device-plugin-ctr
        args: ["--fail-on-init-error=false"]
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins

Verify installation

Finally, to verify the installation, run kubectl describe node <node_name> for the node that has a GPU. You should see that it now lists nvidia.com/gpu as an allocatable resource.

...
Capacity:
  cpu:                4
  ephemeral-storage:  101583780Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16084400Ki
  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  98820701107
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16084400Ki
  nvidia.com/gpu:     1
  pods:               110
...
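As an optional end-to-end smoke test, and assuming the cluster can pull public images, you can schedule a throwaway pod that requests a GPU and runs nvidia-smi. The pod name and CUDA image tag below are illustrative, not part of the Modzy installation:

```shell
# Create a one-shot pod that requests one GPU and tolerates the taints
# applied to the GPU node. Pod name and image tag are examples only.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
  - key: modzy.com/inference-node
    operator: Exists
    effect: NoSchedule
  containers:
  - name: cuda
    image: nvcr.io/nvidia/cuda:11.6.2-base-ubuntu20.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF

# Once the pod completes, its logs should show the same nvidia-smi
# table you saw on the host. Delete the pod afterwards.
kubectl logs gpu-smoke-test
kubectl delete pod gpu-smoke-test
```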
