Adding Nvidia GPU boost to Proxmox k8s using Pulumi and Kubespray

A step-by-step guide to leveraging Nvidia GPUs in Kubernetes


Homelab

In my previous article, I covered configuring a GPU at the VM level in Proxmox using GPU passthrough. However, the ultimate goal was to make this capability available to containerized workloads. In this post, I'll walk you through enabling GPU support in the Kubernetes cluster.

Pre-flight check:

To ensure secure and automated access to the newly provisioned VMs, password authentication for the default ubuntu user is disabled in favor of SSH keys. Before deployment, you must update the cloud-init.yml file with your own public SSH key. This cloud-init configuration adds your key to the VM at creation time, allowing you to connect securely without a password.
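For reference, a minimal sketch of the relevant cloud-init.yml fields might look like the following; the key is a placeholder, and the actual file in the repo may carry additional settings.

#cloud-config
ssh_pwauth: false                  # disable password logins
users:
  - name: ubuntu
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL   # passwordless sudo for automation
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... you@your-machine   # replace with your own public key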

Updating cluster inventory

The inventory file has been simplified with the recent changes and now carries the following GPU-specific settings; a fuller sketch of the inventory layout follows the list below.

worker4 ansible_host=192.168.1.69 has_gpu=true node_labels="{'node-role.kubernetes.io/gpu': 'true'}"
  • has_gpu=true custom variable that acts as a flag

  • node_labels applies a label to the node
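For context, a hypothetical Kubespray hosts.ini carrying the GPU worker could be laid out as follows; the hostnames and IPs other than worker4 are illustrative, and the group names follow the Kubespray inventory docs.

[kube_control_plane]
master1 ansible_host=192.168.1.61

[etcd]
master1 ansible_host=192.168.1.61

[kube_node]
worker1 ansible_host=192.168.1.66
worker4 ansible_host=192.168.1.69 has_gpu=true node_labels="{'node-role.kubernetes.io/gpu': 'true'}"

[k8s_cluster:children]
kube_control_plane
kube_node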

Dynamic networking configuration

The GPU node uses a different network interface name (enp6s18) than the other nodes (ens18). This matters because kube-vip, which provides the highly available control-plane endpoint, needs to know the correct interface on each node.

Instead of a static configuration, a dynamic solution is implemented in the values.yml file using a Jinja2 template.

kube_vip_interface: "{% if has_gpu | default(false) %}enp6s18{% else %}ens18{% endif %}"

The template checks for the has_gpu variable defined in the inventory file, and Ansible automatically assigns the correct network interface for kube-vip on each node.
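For context, this template sits alongside the other kube-vip settings in values.yml. A hypothetical excerpt is shown below; the variable names follow Kubespray's kube-vip options, but the repo's actual file may differ.

kube_vip_enabled: true
kube_vip_controlplane_enabled: true
kube_vip_arp_enabled: true
kube_vip_address: 192.168.1.10    # virtual IP for the HA API endpoint (illustrative)
kube_vip_interface: "{% if has_gpu | default(false) %}enp6s18{% else %}ens18{% endif %}"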

Deployment with Docker and Justfile

To improve consistency and simplify the deployment process, I’ve containerized the Kubespray execution using Docker. And to manage the Docker command, I’ve adopted a Justfile.

run-kubespray:
    docker run --rm -it --mount type=bind,source="$(pwd)/k8s_cluster_config",dst=/config \
        --mount type=bind,source="${HOME}/.ssh/id_ed25519",dst=/root/.ssh/id_ed25519 \
        quay.io/kubespray/kubespray:v2.28.0 bash -c "ansible-playbook -i /config/inventory/hosts.ini -e @/config/values.yml cluster.yml"

Now the entire cluster deployment is reduced to a single command: just run-kubespray.

Cluster verification

After Kubespray successfully sets up the cluster, configure kubectl on your local machine by copying the admin.conf file from a control plane node to the local ~/.kube/config.

ssh ubuntu@<control-plane-node-ip> 'sudo cat /etc/kubernetes/admin.conf' > ~/.kube/config

Next, edit the ~/.kube/config file. Find the server address and replace it with the load balancer IP defined in the Kubespray values.yml file (e.g., https://192.168.1.10:6443). This ensures commands are sent to the highly available endpoint instead of a single node.
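If you'd rather not edit the file by hand, the same change can be made with kubectl; the cluster entry name below is an assumption (Kubespray's default is cluster.local), so confirm it first.

# List the cluster entries in your kubeconfig
kubectl config get-clusters

# Point the (assumed) cluster entry at the kube-vip load balancer address
kubectl config set-cluster cluster.local --server=https://192.168.1.10:6443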

Nodes list

Make sure all the nodes have joined by running the following command: kubectl get nodes --show-labels
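To quickly confirm that the GPU label from the inventory was applied, you can also filter on it directly.

kubectl get nodes -l node-role.kubernetes.io/gpu=true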

Installing NVIDIA GPU operator

The NVIDIA GPU Operator automates the management of all the software components needed to provision GPUs in Kubernetes, including drivers, container runtimes, and monitoring tools. The easiest way to install it is with Helm.

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

Next, install the operator in a dedicated namespace.

helm install gpu-operator nvidia/gpu-operator \
    --namespace gpu-operator \
    --create-namespace \
    -f nvidia-setup/values.yml
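The install points at nvidia-setup/values.yml. As a rough sketch of what such a file might contain (the keys below are standard GPU Operator chart options, but the repo's actual values may differ):

driver:
  enabled: true     # let the operator install the NVIDIA driver on the GPU node
toolkit:
  enabled: true     # install the NVIDIA container toolkit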

The helm install command deploys the operator, which then automatically detects the GPU on the labeled worker node and installs all the necessary drivers and plugins.
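Before moving on, it's worth checking that the operator's components have come up cleanly.

# All pods in the gpu-operator namespace should eventually be Running or Completed
kubectl get pods -n gpu-operator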

Verifying GPU Integration

Describe the node and look for nvidia.com/gpu in the labels and nvidia.com/gpu under the Allocatable resources section.

kubectl describe node k8s-worker4

You should see an output similar to this.

Allocatable:
    cpu:                3400m
    ephemeral-storage:  114953451738
    hugepages-1Gi:      0
    hugepages-2Mi:      0
    memory:             7237188Ki
    nvidia.com/gpu:     1
    pods:               110
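As a quicker check across all nodes, the allocatable GPU count can also be pulled out with custom columns (a pattern from the Kubernetes GPU scheduling docs).

kubectl get nodes "-o=custom-columns=NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"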

To confirm that everything is working, you can run a simple test pod that requests GPU resources and executes the nvidia-smi command.

Create a file named gpu-test.yaml (kept under nvidia-setup/ in my repo) with the following content.

apiVersion: v1
kind: Pod
metadata:
  name: cuda-smi-test
spec:
  # Ensures the pod is only scheduled on a node with the GPU label
  nodeSelector:
    node-role.kubernetes.io/gpu: "true"
  restartPolicy: OnFailure
  containers:
    - name: cuda-test-container
      image: "nvidia/cuda:12.8.1-devel-ubuntu22.04"
      command: ["nvidia-smi"]
      resources:
        requests:
          cpu: "250m"
          memory: "512Mi"
        limits:
          nvidia.com/gpu: "1"
          cpu: "1"
          memory: "1Gi"

Deploying the pod

kubectl apply -f nvidia-setup/gpu-test.yaml

After a few moments, check the logs of the pod.

kubectl logs pods/cuda-smi-test

If the installation was successful, the output will be the familiar nvidia-smi report, showing the details of the GPU, but this time generated from within a Kubernetes pod. With these changes, my homelab is now fully equipped to run GPU-accelerated workloads directly within Kubernetes, marking a significant milestone in its evolution.

nvidia-smi output

Final thoughts

This post demonstrates how to build a Kubernetes cluster on Proxmox using a mutable infrastructure model. A future goal is to explore an immutable setup, and I plan to experiment with Talos to achieve this.

* * * *

Originally published on Medium

🌟 🌟 🌟 The source code for this blog post can be found here 🌟🌟🌟

GitHub - madhank93/homelab at v0.1.4

Reference:

[1] https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html#

[2] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_cpu_type

[3] https://kubespray.io/#/docs/ansible/inventory

[4] https://www.hashicorp.com/en/resources/what-is-mutable-vs-immutable-infrastructure