Configure an Ubuntu GPU node
Overview
Configuring an Ubuntu node with GPUs is a little more involved than configuring a typical worker node. This guide walks through all the steps necessary to configure an Ubuntu node with attached GPUs so that those GPUs can be provisioned and used by Modzy in the Kubernetes cluster.
Update kernel features
In order to use the GPU for running models rather than for driving a display, we need to tell the kernel not to load the default nouveau display driver module. To do that, we'll be following the instructions provided by NVIDIA.
First, copy the following file to /etc/modprobe.d/blacklist-nouveau.conf:
blacklist nouveau
options nouveau modeset=0
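If you'd rather create this file directly from your SSH session, one way to do it (assuming you have sudo access) is with tee:
# Writes the two-line blacklist file shown above
cat <<'EOF' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF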
Next, we need to rebuild the kernel's initramfs so the change is picked up at boot:
sudo update-initramfs -u
Reboot!
For the kernel changes to take effect, we need to reboot the machine.
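From your SSH session this is simply:
sudo reboot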
Install Drivers
The first step is to ensure that we have the proper NVIDIA driver(s) installed for the attached hardware. Drivers can be downloaded from NVIDIA's driver download page. You probably want the Product Type to be Data Center / Tesla; then select the product series and product for the GPU you have attached, and the operating system should be Linux 64-bit.
After downloading the driver as a *.run file, copy it to your GPU node and then run the following commands:
sudo apt update
sudo apt-get install build-essential
# Replace this .run file name with the file that you downloaded.
sudo sh NVIDIA-Linux-x86_64-510.73.08.run
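Once the installer completes, it's worth sanity-checking the driver before moving on. nvidia-smi is installed alongside the driver and should list the attached GPU(s):
# Should print the driver version and the attached GPU(s)
nvidia-smi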
Install NVIDIA Container Runtime
In order to use your NVIDIA GPU inside a container, a different container runtime needs to be installed. Docker (or containerd) will use it to run containers on this server and expose the GPU to them.
We can install it directly from NVIDIA by adding their apt repository to our sources:
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
sudo apt-get install nvidia-container-runtime
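As an optional sanity check, the runtime binaries installed by this package should now be on the node's PATH:
# Both are installed by the nvidia-container-runtime package and its dependencies
which nvidia-container-runtime
nvidia-container-cli info   # prints the driver/CUDA version and the visible GPUs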
Initialize K3s agent
Now that we have our prerequisites installed, it's time to initialize, but not yet start, the k3s agent on this node. We'll join it to our Kubernetes cluster the way we would a normal inference node, with two differences: 1) we'll tell the installer not to start the k3s agent immediately, and 2) we'll add an additional taint to the node so that only pods requiring a GPU will be scheduled on it.
curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_VERSION=v1.21.12+k3s1 sh -s - agent --server https://<IP or DNS of primary node>:6443 --node-label "modzy.com/node-type=inference" --node-taint "modzy.com/inference-node=true:NoSchedule" --node-taint "nvidia.com/gpu=true:NoSchedule" --token <INSERT YOUR TOKEN HERE>
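If you don't have the join token handy, it can be read from the primary (server) node in a default k3s installation:
# Run on the primary node, then paste the value into the command above
sudo cat /var/lib/rancher/k3s/server/node-token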
Using self-signed certificates
If you're using a self-signed certificate for your Modzy installation, there is one additional step you need to perform when creating your Inference nodes. By default, Docker (containerd, specifically, in this case) requires proper TLS validation before it will pull images from a registry. With a self-signed certificate, Modzy's internal registry will fail these checks, so we need to tell the daemon to skip them for Modzy's registry.
If we assume that you're installing Modzy under the domain modzy.example.com, then the registry will be exposed as registry.modzy.example.com, so we need to disable TLS validation for that domain.
On each of your Inference nodes, open an SSH session and create a file /etc/rancher/k3s/registries.yaml with the following content:
configs:
  "registry.modzy.example.com":
    tls:
      insecure_skip_verify: true
Replace registry.modzy.example.com with the domain that you're going to use.
Changing container runtimes
Now that k3s is installed, we need to modify its configuration so that it uses the NVIDIA container runtime we installed rather than its default runtime.
To do that, we need to add a config file template to k3s' config directory that overrides the container runtime. First, let's make sure the directory exists:
sudo mkdir -p /var/lib/rancher/k3s/agent/etc/containerd
Next, save the following template into that directory as config.toml.tmpl:
[plugins.opt]
path = "{{ .NodeConfig.Containerd.Opt }}"
[plugins.cri]
stream_server_address = "127.0.0.1"
stream_server_port = "10010"
enable_selinux = {{ .NodeConfig.SELinux }}
{{- if .DisableCgroup}}
disable_cgroup = true
{{end}}
{{- if .IsRunningInUserNS }}
disable_apparmor = true
restrict_oom_score_adj = true
{{end}}
{{- if .NodeConfig.AgentConfig.PauseImage }}
sandbox_image = "{{ .NodeConfig.AgentConfig.PauseImage }}"
{{end}}
{{- if .NodeConfig.AgentConfig.Snapshotter }}
[plugins.cri.containerd]
snapshotter = "{{ .NodeConfig.AgentConfig.Snapshotter }}"
disable_snapshot_annotations = {{ if eq .NodeConfig.AgentConfig.Snapshotter "stargz" }}false{{else}}true{{end}}
{{ if eq .NodeConfig.AgentConfig.Snapshotter "stargz" }}
{{ if .NodeConfig.AgentConfig.ImageServiceSocket }}
[plugins.stargz]
cri_keychain_image_service_path = "{{ .NodeConfig.AgentConfig.ImageServiceSocket }}"
[plugins.stargz.cri_keychain]
enable_keychain = true
{{end}}
{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins.stargz.registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins.stargz.registry.mirrors."{{$k}}"]
endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{if $v.Rewrites}}
[plugins.stargz.registry.mirrors."{{$k}}".rewrite]
{{range $pattern, $replace := $v.Rewrites}}
"{{$pattern}}" = "{{$replace}}"
{{end}}
{{end}}
{{end}}
{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins.stargz.registry.configs."{{$k}}".auth]
{{ if $v.Auth.Username }}username = {{ printf "%q" $v.Auth.Username }}{{end}}
{{ if $v.Auth.Password }}password = {{ printf "%q" $v.Auth.Password }}{{end}}
{{ if $v.Auth.Auth }}auth = {{ printf "%q" $v.Auth.Auth }}{{end}}
{{ if $v.Auth.IdentityToken }}identitytoken = {{ printf "%q" $v.Auth.IdentityToken }}{{end}}
{{end}}
{{ if $v.TLS }}
[plugins.stargz.registry.configs."{{$k}}".tls]
{{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
{{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
{{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
{{ if $v.TLS.InsecureSkipVerify }}insecure_skip_verify = true{{end}}
{{end}}
{{end}}
{{end}}
{{end}}
{{end}}
{{- if not .NodeConfig.NoFlannel }}
[plugins.cri.cni]
bin_dir = "{{ .NodeConfig.AgentConfig.CNIBinDir }}"
conf_dir = "{{ .NodeConfig.AgentConfig.CNIConfDir }}"
{{end}}
[plugins.cri.containerd.runtimes.runc]
# ---- changed from 'io.containerd.runc.v2' for GPU support
runtime_type = "io.containerd.runtime.v1.linux"
# ---- added for GPU support
[plugins.linux]
runtime = "nvidia-container-runtime"
{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors."{{$k}}"]
endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{if $v.Rewrites}}
[plugins.cri.registry.mirrors."{{$k}}".rewrite]
{{range $pattern, $replace := $v.Rewrites}}
"{{$pattern}}" = "{{$replace}}"
{{end}}
{{end}}
{{end}}
{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins.cri.registry.configs."{{$k}}".auth]
{{ if $v.Auth.Username }}username = {{ printf "%q" $v.Auth.Username }}{{end}}
{{ if $v.Auth.Password }}password = {{ printf "%q" $v.Auth.Password }}{{end}}
{{ if $v.Auth.Auth }}auth = {{ printf "%q" $v.Auth.Auth }}{{end}}
{{ if $v.Auth.IdentityToken }}identitytoken = {{ printf "%q" $v.Auth.IdentityToken }}{{end}}
{{end}}
{{ if $v.TLS }}
[plugins.cri.registry.configs."{{$k}}".tls]
{{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
{{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
{{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
{{ if $v.TLS.InsecureSkipVerify }}insecure_skip_verify = true{{end}}
{{end}}
{{end}}
{{end}}
NOTE
This file is a slightly modified copy of k3s' default config template: the runtime_type under [plugins.cri.containerd.runtimes.runc] has been changed and a [plugins.linux] block has been added so that containerd uses the nvidia-container-runtime.
Now that the updated configuration is in place, we can start k3s:
sudo systemctl start k3s-agent
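Before moving on, you can confirm that the agent started cleanly and that the node has joined the cluster:
# On the GPU node
sudo systemctl status k3s-agent
# From a machine with kubectl access to the cluster
kubectl get nodes -o wide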
Install NVIDIA device plugin
The final step is to install the NVIDIA Kubernetes device plugin, which adds the GPU to applicable nodes as a schedulable resource. The plugin can be installed by running kubectl apply -f nvidia-device-plugin.yaml using the following file:
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      # This annotation is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
        # This toleration is deprecated. Kept here for backward compatibility
        # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
        - key: CriticalAddonsOnly
          operator: Exists
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
        # Allow this pod to run on Modzy inference nodes
        - key: modzy.com/inference-node
          operator: Exists
          effect: NoSchedule
      # Mark this pod as a critical add-on; when enabled, the critical add-on
      # scheduler reserves resources for critical add-on pods so that they can
      # be rescheduled after a failure.
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      priorityClassName: "system-node-critical"
      containers:
        - image: nvcr.io/nvidia/k8s-device-plugin:v0.11.0
          name: nvidia-device-plugin-ctr
          args: ["--fail-on-init-error=false"]
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
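For example, assuming you saved the manifest above as nvidia-device-plugin.yaml:
kubectl apply -f nvidia-device-plugin.yaml
# The device plugin pod should be Running on the GPU node
kubectl -n kube-system get pods -l name=nvidia-device-plugin-ds -o wide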
Verify installation
Finally, to verify the installation, run kubectl describe node <node_name> for the node that has a GPU and you should notice that it now lists nvidia.com/gpu as an allocatable resource.
...
Capacity:
  cpu:                4
  ephemeral-storage:  101583780Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16084400Ki
  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  98820701107
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16084400Ki
  nvidia.com/gpu:     1
  pods:               110
...
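As a final smoke test, you can schedule a minimal pod that requests a GPU and tolerates the taints applied earlier. This is only a sketch; the pod name and CUDA image tag below are illustrative, so substitute any CUDA-enabled image you have access to:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  tolerations:
    # Match the taints applied when the node was joined
    - key: modzy.com/inference-node
      operator: Exists
      effect: NoSchedule
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:11.4.2-base-ubuntu20.04  # illustrative tag
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
If everything is wired up correctly, kubectl logs gpu-smoke-test should show the same nvidia-smi output you saw on the host.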