Problem
After a standard dnf update, the system booted into the newest kernels (611.45.1, then 611.49.1). The NVIDIA module failed to load on both, so SDDM would not start and the server was left without a graphical interface.
Root cause: kmod-nvidia-open is a pre-compiled module tied to a specific kernel version. Without DKMS, nothing recompiles the module automatically when the kernel changes. If the repo has no matching kmod-nvidia package for the new kernel at update time, the GPU driver silently breaks on the next boot.
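One way to spot the mismatch before rebooting is to check which installed kernels actually have an NVIDIA module on disk. The following is a minimal sketch, not a vetted tool. It assumes the kmod package installs files named `nvidia*.ko` (possibly compressed, or as weak-updates symlinks) somewhere under `/lib/modules/<kernel-version>/`. The function names `has_nvidia_module` and `report_kernels` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report which installed kernels have an NVIDIA kernel module on disk.
# Assumption: module files match nvidia*.ko* under /lib/modules/<kernel-version>/.

has_nvidia_module() {
    # $1 = kernel version, $2 = modules root (defaults to /lib/modules)
    local kver="$1"
    local root="${2:-/lib/modules}"
    [ -d "$root/$kver" ] || return 1
    # Regular files and symlinks both match by name; stop at the first hit.
    [ -n "$(find "$root/$kver" -name 'nvidia*.ko*' -print -quit 2>/dev/null)" ]
}

report_kernels() {
    # $1 = modules root (defaults to /lib/modules)
    local root="${1:-/lib/modules}"
    local dir kver
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        kver="$(basename "$dir")"
        if has_nvidia_module "$kver" "$root"; then
            echo "$kver: nvidia module present"
        else
            echo "$kver: nvidia module MISSING"
        fi
    done
}
```

Calling `report_kernels` with no arguments scans `/lib/modules`. A kernel reported as MISSING would be expected to boot without the GPU driver, as happened here.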
Workaround applied
```bash
# Rolled back to last working kernel
grubby --set-default /boot/vmlinuz-5.14.0-611.41.1.el9_7.x86_64
# Locked kernel to prevent auto-update
dnf install python3-dnf-plugin-versionlock
dnf versionlock add kernel-0:5.14.0-611.41.1.el9_7.*
dnf versionlock add kernel-core-0:5.14.0-611.41.1.el9_7.*
dnf versionlock add kernel-modules-0:5.14.0-611.41.1.el9_7.*
# Removed broken kernels
dnf remove kernel-5.14.0-611.45.1.el9_7.x86_64
dnf remove kernel-5.14.0-611.49.1.el9_7.x86_64
```
The system is now stable, but the kernel is frozen: no kernel security updates can be installed until a matching NVIDIA kmod is published.
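Once the lock is lifted and newer kernels are installed again, the default boot entry could be chosen by looking for the newest kernel that has an NVIDIA module, rather than simply the newest kernel. This is a rough sketch under the same assumption as the rest of this report, namely that the module files match `nvidia*.ko*` under `/lib/modules/<kernel-version>/`. The function name `newest_kernel_with_nvidia` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the newest installed kernel version that has an NVIDIA module.
# Assumption: module files match nvidia*.ko* under /lib/modules/<kernel-version>/.

newest_kernel_with_nvidia() {
    # $1 = modules root (defaults to /lib/modules)
    local root="${1:-/lib/modules}"
    local best=""
    local kver
    # sort -V orders version strings, so 611.49.1 sorts after 611.41.1.
    # Kernel version directory names contain no whitespace, so word
    # splitting on the command substitution is safe here.
    for kver in $(ls -1 "$root" 2>/dev/null | sort -V); do
        if [ -n "$(find "$root/$kver" -name 'nvidia*.ko*' -print -quit 2>/dev/null)" ]; then
            best="$kver"   # later (newer) matches overwrite earlier ones
        fi
    done
    # Print the result; return non-zero when no kernel has the module.
    [ -n "$best" ] && printf '%s\n' "$best"
}
```

The printed version could then be passed to the same command used in the workaround, for example `grubby --set-default "/boot/vmlinuz-$(newest_kernel_with_nvidia)"`, after confirming that the function returned a value.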
Questions for the team
- Is a `kmod-nvidia-open-595.58.03` package planned or available for kernels `611.45.1` or `611.49.1`?
- Is there a roadmap to ship `nvidia-driver-dkms` in the official AlmaLinux NVIDIA repo to avoid this class of issue entirely?
- Should a dependency or conflict be declared in the repo to prevent `kernel` from updating without a matching `kmod-nvidia`?
Full technical report available on request (Word doc with complete logs, timeline, and recommendations).