TECHNICAL INCIDENT REPORT: NVIDIA Driver / AlmaLinux 9.7 Kernel Incompatibility

Problem

After a standard dnf update, the system booted on the newest kernels (5.14.0-611.45.1.el9_7, then 5.14.0-611.49.1.el9_7). The NVIDIA module failed to load on both, so SDDM would not start and the server was left with no graphical interface.

Root cause: kmod-nvidia-open is a pre-compiled module built for a specific kernel version. Without DKMS, nothing recompiles it when the kernel changes. If the repo has no kmod-nvidia-open build matching the new kernel at update time, the kernel is still updated and the GPU driver silently fails to load on the next boot.
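
The failure mode above can be caught before rebooting. A minimal sketch, assuming the module file is named nvidia.ko (optionally compressed) and sits somewhere under /lib/modules/<kernel-version>/; the function name is hypothetical and the file layout is an assumption, not something confirmed by the packaging:

```shell
# Sketch: list installed kernels that have no NVIDIA module file on disk.
# Assumption: the module is named nvidia.ko, nvidia.ko.xz, etc. and lives
# under <modules_root>/<kernel-version>/ (for example in extra/).
kernels_missing_nvidia() {
    modules_root="${1:-/lib/modules}"
    for dir in "$modules_root"/*/; do
        # Skip the unexpanded glob when the directory is empty.
        [ -d "$dir" ] || continue
        kver=$(basename "$dir")
        # Print the kernel version if no matching file name is found.
        if ! find "$dir" -name 'nvidia.ko*' 2>/dev/null | grep -q .; then
            printf '%s\n' "$kver"
        fi
    done
}

# Usage after "dnf update" and before rebooting:
#   kernels_missing_nvidia
# Any kernel version printed would boot without the GPU driver.
```

An empty result only shows that a file with the expected name exists for every installed kernel. It does not prove that the module will load.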


Workaround applied

bash

# Rolled back to last working kernel
grubby --set-default /boot/vmlinuz-5.14.0-611.41.1.el9_7.x86_64

# Locked kernel to prevent auto-update
dnf install python3-dnf-plugin-versionlock
# Patterns are quoted so the shell passes the glob to dnf unexpanded
dnf versionlock add 'kernel-0:5.14.0-611.41.1.el9_7.*'
dnf versionlock add 'kernel-core-0:5.14.0-611.41.1.el9_7.*'
dnf versionlock add 'kernel-modules-0:5.14.0-611.41.1.el9_7.*'

# Removed broken kernels
dnf remove kernel-5.14.0-611.45.1.el9_7.x86_64
dnf remove kernel-5.14.0-611.49.1.el9_7.x86_64

The system is now stable, but the kernel is frozen: no kernel security updates can be applied until a matching NVIDIA kmod is published.
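
Since the lock has to be lifted by hand once a matching kmod appears, it may help to generate the lock specs from one place. A rough sketch, assuming the same three packages as in the workaround and epoch 0; the function name is hypothetical:

```shell
# Sketch: print the versionlock specs for one kernel release.
# Assumption: kernel, kernel-core and kernel-modules are the packages to
# lock, all with epoch 0, as in the workaround above.
kernel_lock_specs() {
    release="$1"   # e.g. 5.14.0-611.41.1.el9_7
    for pkg in kernel kernel-core kernel-modules; do
        printf '%s-0:%s.*\n' "$pkg" "$release"
    done
}

# Apply the locks (as root):
#   kernel_lock_specs 5.14.0-611.41.1.el9_7 | xargs -r dnf versionlock add
# Review them, then lift them once a matching kmod is available:
#   dnf versionlock list
#   kernel_lock_specs 5.14.0-611.41.1.el9_7 | xargs -r dnf versionlock delete
```

Using the same generator for add and delete keeps the two lists from drifting apart, so a forgotten lock does not keep the kernel frozen longer than needed.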


Questions for the team

  1. Is a kmod-nvidia-open-595.58.03 package planned/available for kernels 611.45.1 or 611.49.1?

  2. Is there a roadmap to ship nvidia-driver-dkms in the official AlmaLinux NVIDIA repo to avoid this class of issue entirely?

  3. Should a dependency or conflict be declared in the repo to prevent the kernel from updating without a matching kmod-nvidia?


Full technical report available on request (Word doc with complete logs, timeline, and recommendations).

The kernel modules built by AlmaLinux are signed with the same key as the rest of the AlmaLinux kernel. This allows the modules to be loaded with Secure Boot enabled without enrolling additional certificates in UEFI.

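The private key used by AlmaLinux cannot be given to users under any circumstances. Modules built locally by DKMS therefore cannot be signed with that key; to load under Secure Boot they would have to be signed with a user-generated key enrolled as a Machine Owner Key (MOK).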
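
Whether the signing constraint matters on a given machine depends on whether Secure Boot is actually enabled. A minimal sketch, assuming mokutil is installed and reports the state as "SecureBoot enabled" or "SecureBoot disabled"; the function name is hypothetical:

```shell
# Sketch: report the Secure Boot state as enabled, disabled or unknown.
# Assumption: "mokutil --sb-state" prints "SecureBoot enabled" or
# "SecureBoot disabled"; anything else is reported as unknown.
secure_boot_state() {
    if ! command -v mokutil >/dev/null 2>&1; then
        echo unknown
        return 0
    fi
    case "$(mokutil --sb-state 2>/dev/null)" in
        *"SecureBoot enabled"*)  echo enabled ;;
        *"SecureBoot disabled"*) echo disabled ;;
        *)                       echo unknown ;;
    esac
}

# Usage:
#   secure_boot_state
# To see who signed the module for the running kernel:
#   modinfo -F signer nvidia
```

If the state is "enabled", a module signed with a key the firmware does not trust will be rejected at load time. That is the case for a DKMS-built module with no enrolled MOK.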