Nvidia nvfs dkms issue

Antro
Posts: 27
Joined: Sat Oct 30, 2021 7:31 am

#1 Post by Antro »

Hi there!
I'm still experiencing issues with the NVIDIA DKMS modules: the latest working kernel is 6.11.10; neither the 6.12 nor the 6.14 family can compile the modules correctly (ahs, liquorix, ...). The kernels themselves work as expected without the modules, but obviously only with console output. Here is the latest attempt with the liquorix branch:

Code: Select all

$ sudo apt-get -f install linux-config-6.12 linux-headers-6.14.2-1-liquorix-amd64 linux-headers-liquorix-amd64
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
linux-config-6.12 is already the newest version (6.12.21-1~mx23ahs).
linux-headers-6.14.2-1-liquorix-amd64 is already the newest version (6.14-3~mx23ahs+1).
linux-headers-liquorix-amd64 is already the newest version (6.14-3~mx23ahs+1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
2 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] Y
Setting up linux-headers-6.14.2-1-liquorix-amd64 (6.14-3~mx23ahs+1) ...
/etc/kernel/header_postinst.d/dkms:
dkms: running auto installation service for kernel 6.14.2-1-liquorix-amd64.
/usr/sbin/dkms.mx autoinstall --kernelver 6.14.2-1-liquorix-amd64
Sign command: /lib/modules/6.14.2-1-liquorix-amd64/build/scripts/sign-file
Binary /lib/modules/6.14.2-1-liquorix-amd64/build/scripts/sign-file not found, modules won't be signed

Building module:
Cleaning build area...
'make' -j16 KVER=6.14.2-1-liquorix-amd64 IGNORE_CC_MISMATCH='1'.................(bad exit status: 2)
Error! Bad return status for module build on kernel: 6.14.2-1-liquorix-amd64 (x86_64)
Consult /var/lib/dkms/nvidia-fs/2.13/build/make.log for more information.
Error! One or more modules failed to install during autoinstall.
Refer to previous errors for more information.
dkms: autoinstall for kernel: 6.14.2-1-liquorix-amd64 failed!
run-parts: /etc/kernel/header_postinst.d/dkms exited with return code 11
Failed to process /etc/kernel/header_postinst.d at /var/lib/dpkg/info/linux-headers-6.14.2-1-liquorix-amd64.postinst line 11.
dpkg: error processing package linux-headers-6.14.2-1-liquorix-amd64 (--configure):
 installed linux-headers-6.14.2-1-liquorix-amd64 package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of linux-headers-liquorix-amd64:
 linux-headers-liquorix-amd64 depends on linux-headers-6.14.2-1-liquorix-amd64 (= 6.14-3~mx23ahs+1); however:
  Package linux-headers-6.14.2-1-liquorix-amd64 is not configured yet.

dpkg: error processing package linux-headers-liquorix-amd64 (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 linux-headers-6.14.2-1-liquorix-amd64
 linux-headers-liquorix-amd64
E: Sub-process /usr/bin/dpkg returned an error code (1)
And this is the make.log

Code: Select all

DKMS make.log for nvidia-fs-2.13 for kernel 6.14.2-1-liquorix-amd64 (x86_64)
Mon Apr 21 10:53:18 AM CEST 2025
./configure 6.14.2-1-liquorix-amd64
Picking NVIDIA driver sources from NVIDIA_SRC_DIR=/usr/src/nvidia-current-535.216.03/nvidia-peermem. If that does not meet your expectation, you might have a stale driver still around and that might cause problems.
chmod +x ./create_nv.symvers.sh
./create_nv.symvers.sh 6.14.2-1-liquorix-amd64
Getting symbol versions from /lib/modules/6.14.2-1-liquorix-amd64/updates/dkms/nvidia-current.ko ...
Created: /var/lib/dkms/nvidia-fs/2.13/build/nv.symvers
cat nv.symvers >> Module.symvers
checking if uaccess.h access_ok has 3 parameters... no
checking if uaccess.h access_ok has 2 parameters... yes
Checking if blkdev.h has blk_rq_payload_bytes... yes
Checking if fs.h has call_read_iter and call_write_iter... no
Checking if fs.h has filemap_range_has_page... no
Checking if kiocb structue has ki_complete field... yes
Checking if KI_COMPLETE has 3 parameters ... no
Checking if vm_fault_t exist in mm_types.h... yes
Checking if IOCB_HIPRI flag exists in fs.h... yes
Checking if enum PCIE_SPEED_32_0GT exists in pci.h... yes
Checking if atomic64_t counter is of type long... no
Checking if RQF_COPY_USER is present or not... no
Checking if dma_drain_size and dma_drain_needed are present in struct request_queue... no
Checking if struct proc_ops is present or not ... yes
Checking if split is present in vm_operations_struct or not ... no
Checking if mremap in vm_operations_struct has one parameter... yes
Checking if mremap in vm_operations_struct has two parameters... no
Checking if symbol module_mutex is present... no
Checking if blk-integrity.h is present... yes
KCPPFLAGS="-DCONFIG_NVFS_STATS=y -DGDS_VERSION=1.4.0.29 -DNVFS_ENABLE_KERN_RDMA_SUPPORT -DNVFS_BATCH_SUPPORT=y" CONFIG_NVFS_BATCH_SUPPORT=y CONFIG_NVFS_STATS=y make -j4 -C /lib/modules/6.14.2-1-liquorix-amd64/build  M=$PWD modules
make[1]: warning: -j4 forced in submake: resetting jobserver mode.
make[1]: Entering directory '/usr/src/linux-headers-6.14.2-1-liquorix-amd64'
make[2]: Entering directory '/var/lib/dkms/nvidia-fs/2.13/build'
  CC [M]  nvfs-core.o
  CC [M]  nvfs-dma.o
  CC [M]  nvfs-mmap.o
  CC [M]  nvfs-pci.o
In file included from nvfs-dma.c:28:
nvfs-core.h:47: warning: "MAX" redefined
   47 | #define MAX(x, y) (((x) > (y)) ? (x) : (y))
      | 
In file included from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/kernel.h:28,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/cpumask.h:11,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/paravirt.h:21,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/cpuid.h:71,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/processor.h:19,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/sched.h:13,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/ratelimit.h:6,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/dev_printk.h:16,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/device.h:15,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/dma-mapping.h:5,
                 from nvfs-dma.c:24:
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:315: note: this is the location of the previous definition
  315 | #define MAX(a, b) __cmp(max, a, b)
      | 
nvfs-core.h:48: warning: "MIN" redefined
   48 | #define MIN(x, y) (((x) < (y)) ? (x) : (y))
      | 
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:314: note: this is the location of the previous definition
  314 | #define MIN(a, b) __cmp(min, a, b)
      | 
In file included from nvfs-pci.c:32:
nvfs-core.h:47: warning: "MAX" redefined
   47 | #define MAX(x, y) (((x) > (y)) ? (x) : (y))
      | 
In file included from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/ioport.h:15,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/pci.h:31,
                 from nvfs-pci.c:25:
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:315: note: this is the location of the previous definition
  315 | #define MAX(a, b) __cmp(max, a, b)
      | 
nvfs-core.h:48: warning: "MIN" redefined
   48 | #define MIN(x, y) (((x) < (y)) ? (x) : (y))
      | 
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:314: note: this is the location of the previous definition
  314 | #define MIN(a, b) __cmp(min, a, b)
      | 
nvfs-pci.c:162:14: warning: no previous prototype for ‘nvfs_create_gpu_hash_entry’ [-Wmissing-prototypes]
  162 | unsigned int nvfs_create_gpu_hash_entry(uint64_t pdevinfo)
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~
nvfs-pci.c:175:14: warning: no previous prototype for ‘nvfs_create_peer_hash_entry’ [-Wmissing-prototypes]
  175 | unsigned int nvfs_create_peer_hash_entry(uint64_t pdevinfo)
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~
nvfs-pci.c:207:14: warning: no previous prototype for ‘nvfs_get_peer_hash_index’ [-Wmissing-prototypes]
  207 | unsigned int nvfs_get_peer_hash_index(uint64_t pdevinfo)
      |              ^~~~~~~~~~~~~~~~~~~~~~~~
nvfs-pci.c: In function ‘__nvfs_find_all_device_paths’:
nvfs-pci.c:312:40: warning: implicit conversion from ‘enum pcie_link_width’ to ‘enum pci_bus_speed’ [-Wenum-conversion]
  312 |         enum pci_bus_speed lnk_speed = PCIE_LNK_WIDTH_RESRV;
      |                                        ^~~~~~~~~~~~~~~~~~~~
nvfs-pci.c:313:42: warning: implicit conversion from ‘enum pci_bus_speed’ to ‘enum pcie_link_width’ [-Wenum-conversion]
  313 |         enum pcie_link_width lnk_width = PCI_SPEED_UNKNOWN;
      |                                          ^~~~~~~~~~~~~~~~~
nvfs-pci.c: At top level:
nvfs-pci.c:729:10: warning: no previous prototype for ‘nvfs_aggregate_peer_usage_by_distance’ [-Wmissing-prototypes]
  729 | uint64_t nvfs_aggregate_peer_usage_by_distance(unsigned int gpu_index, unsigned int pci_dist) {
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from nvfs-dma.c:31:
nvfs-dma.h:31:10: fatal error: linux/blk-mq-pci.h: No such file or directory
   31 | #include <linux/blk-mq-pci.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[4]: *** [/usr/src/linux-headers-6.14.2-1-liquorix-amd64/scripts/Makefile.build:207: nvfs-dma.o] Error 1
make[4]: *** Waiting for unfinished jobs....
In file included from nvfs-core.c:57:
nvfs-core.h:47: warning: "MAX" redefined
   47 | #define MAX(x, y) (((x) > (y)) ? (x) : (y))
      | 
In file included from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/kernel.h:28,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/cpumask.h:11,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/paravirt.h:21,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/cpuid.h:71,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/processor.h:19,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/sched.h:13,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/ratelimit.h:6,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/dev_printk.h:16,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/device.h:15,
                 from nvfs-core.c:25:
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:315: note: this is the location of the previous definition
  315 | #define MAX(a, b) __cmp(max, a, b)
      | 
nvfs-core.h:48: warning: "MIN" redefined
   48 | #define MIN(x, y) (((x) < (y)) ? (x) : (y))
      | 
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:314: note: this is the location of the previous definition
  314 | #define MIN(a, b) __cmp(min, a, b)
      | 
In file included from nvfs-mmap.c:45:
nvfs-core.h:47: warning: "MAX" redefined
   47 | #define MAX(x, y) (((x) > (y)) ? (x) : (y))
      | 
In file included from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/kernel.h:28,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/cpumask.h:11,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/paravirt.h:21,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/cpuid.h:71,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/arch/x86/include/asm/processor.h:19,
                 from /usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/sched.h:13,
                 from nvfs-mmap.c:25:
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:315: note: this is the location of the previous definition
  315 | #define MAX(a, b) __cmp(max, a, b)
      | 
nvfs-core.h:48: warning: "MIN" redefined
   48 | #define MIN(x, y) (((x) < (y)) ? (x) : (y))
      | 
/usr/src/linux-headers-6.14.2-1-liquorix-amd64/include/linux/minmax.h:314: note: this is the location of the previous definition
  314 | #define MIN(a, b) __cmp(min, a, b)
      | 
In file included from nvfs-core.c:59:
nvfs-dma.h:31:10: fatal error: linux/blk-mq-pci.h: No such file or directory
   31 | #include <linux/blk-mq-pci.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[4]: *** [/usr/src/linux-headers-6.14.2-1-liquorix-amd64/scripts/Makefile.build:207: nvfs-core.o] Error 1
In file included from nvfs-mmap.c:48:
nvfs-kernel-interface.h:29:10: fatal error: linux/blk-mq-pci.h: No such file or directory
   29 | #include <linux/blk-mq-pci.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[4]: *** [/usr/src/linux-headers-6.14.2-1-liquorix-amd64/scripts/Makefile.build:207: nvfs-mmap.o] Error 1
make[3]: *** [/usr/src/linux-headers-6.14.2-1-liquorix-amd64/Makefile:2004: .] Error 2
make[2]: *** [/usr/src/linux-headers-6.14.2-1-liquorix-amd64/Makefile:251: __sub-make] Error 2
make[2]: Leaving directory '/var/lib/dkms/nvidia-fs/2.13/build'
make[1]: *** [Makefile:251: __sub-make] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.14.2-1-liquorix-amd64'
make: *** [Makefile:107: module] Error 2
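
Most of that log is warnings; the build only stops on the fatal errors. A rough sketch for pulling just those out of a make.log follows. `fatal_errors` is a made-up helper name, not something dkms ships, and it assumes GCC's usual `file:line:col: fatal error: message` layout.

```shell
#!/bin/sh
# fatal_errors: hypothetical helper (not part of dkms).
# Reads compiler output on stdin and prints each distinct fatal-error
# message once, with the "file:line:col:" prefix stripped.
fatal_errors() {
    grep 'fatal error:' | sed 's/^[^ ]* fatal error: //' | sort -u
}

# Demo on two lines copied from the log above; both collapse to one message.
printf '%s\n' \
    'nvfs-dma.h:31:10: fatal error: linux/blk-mq-pci.h: No such file or directory' \
    'nvfs-kernel-interface.h:29:10: fatal error: linux/blk-mq-pci.h: No such file or directory' \
    | fatal_errors
```

On the affected machine the input would be the real log, for example `fatal_errors < /var/lib/dkms/nvidia-fs/2.13/build/make.log`.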

Installed nvidia packages:

Code: Select all

$ sudo apt list --installed | grep nvidia

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

firmware-nvidia-graphics/mx,mx,now 20241210-1~mx23ahs all [installed,automatic]
firmware-nvidia-gsp/mx,now 535.216.03-3~mx23ahs amd64 [installed]
glx-alternative-nvidia/stable,now 1.2.2 amd64 [installed,automatic]
libegl-nvidia0/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
libgl1-nvidia-glvnd-glx/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
libgl1-nvidia-glvnd-glx/mx,now 535.216.03-3~mx23ahs i386 [installed]
libgles-nvidia1/mx,now 535.216.03-3~mx23ahs amd64 [installed]
libgles-nvidia2/mx,now 535.216.03-3~mx23ahs amd64 [installed]
libglx-nvidia0/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
libglx-nvidia0/mx,now 535.216.03-3~mx23ahs i386 [installed,automatic]
libnvidia-cfg1/mx,now 535.216.03-3~mx23ahs amd64 [installed]
libnvidia-egl-wayland1/stable,now 1:1.1.10-1 amd64 [installed,automatic]
libnvidia-eglcore/mx,now 535.216.03-3~mx23ahs amd64 [installed]
libnvidia-eglcore/mx,now 535.216.03-3~mx23ahs i386 [installed,automatic]
libnvidia-encode1/mx,now 535.216.03-3~mx23ahs amd64 [installed]
libnvidia-glcore/mx,now 535.216.03-3~mx23ahs amd64 [installed]
libnvidia-glcore/mx,now 535.216.03-3~mx23ahs i386 [installed,automatic]
libnvidia-glvkspirv/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
libnvidia-glvkspirv/mx,now 535.216.03-3~mx23ahs i386 [installed,automatic]
libnvidia-ml-dev/stable,now 11.8.86~11.8.0-5~deb12u1 amd64 [installed,automatic]
libnvidia-ml1/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
libnvidia-pkcs11-openssl3/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
libnvidia-ptxjitcompiler1/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-alternative/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-cuda-dev/stable,now 11.8.89~11.8.0-5~deb12u1 amd64 [installed,automatic]
nvidia-cuda-toolkit-doc/stable,stable,now 11.8.0-5~deb12u1 all [installed]
nvidia-cuda-toolkit-gcc/stable,now 11.8.0-5~deb12u1 amd64 [installed]
nvidia-cuda-toolkit/stable,now 11.8.89~11.8.0-5~deb12u1 amd64 [installed,automatic]
nvidia-detect/mx,now 535.216.03-3~mx23ahs amd64 [installed]
nvidia-driver-bin/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-driver-libs/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-driver/mx,now 535.216.03-3~mx23ahs amd64 [installed]
nvidia-egl-common/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-egl-icd/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-fs-dkms/stable,now 2.13.5~11.8.0-5~deb12u1 amd64 [installed]
nvidia-installer-cleanup/stable,now 20220217+3~deb12u1 amd64 [installed,automatic]
nvidia-kernel-common/stable,now 20220217+3~deb12u1 amd64 [installed,automatic]
nvidia-kernel-dkms/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-kernel-source/mx,now 535.216.03-3~mx23ahs amd64 [installed]
nvidia-kernel-support/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-legacy-check/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
nvidia-modprobe/stable,now 535.161.07-1~deb12u1 amd64 [installed,automatic]
nvidia-opencl-dev/stable,now 11.8.89~11.8.0-5~deb12u1 amd64 [installed,automatic]
nvidia-profiler/stable,now 11.8.87~11.8.0-5~deb12u1 amd64 [installed,automatic]
nvidia-settings/stable,now 535.171.04-1~deb12u1 amd64 [installed]
nvidia-smi/mx,now 535.216.03-3~mx23ahs amd64 [installed]
nvidia-support/stable,now 20220217+3~deb12u1 amd64 [installed,automatic]
nvidia-vdpau-driver/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
xserver-xorg-video-nvidia/mx,now 535.216.03-3~mx23ahs amd64 [installed,automatic]
I already tried wiping the drivers/modules and reinstalling from scratch: always the same issue...

Any suggestion is welcome! If you need additional information, just ask.
Thanks in advance for any hints
Bye for now
A

Eadwine Rose
Administrator
Posts: 14423
Joined: Wed Jul 12, 2006 2:10 am

#2 Post by Eadwine Rose »

According to the forum rules (please read): please provide the full Quick System Info from the menu, using the "Copy for forum" button, with no edits.

LiveUSB version is OK if needed.

Antro
Posts: 27
Joined: Sat Oct 30, 2021 7:31 am

#3 Post by Antro »

Code: Select all

System:    Kernel: 6.11.10-amd64 [6.11.10-1~mx23ahs] x86_64 bits: 64 compiler: N/A 
           parameters: BOOT_IMAGE=/boot/vmlinuz-6.11.10-amd64 root=UUID=<filter> ro quiet splash 
           Desktop: Xfce 4.20.0 tk: Gtk 3.24.38 info: xfce4-panel wm: xfwm 4.20.0 vt: 7 
           dm: LightDM 1.32.0 Distro: MX-23.6_x64 Libretto February 15  2020 
           base: Debian GNU/Linux 12 (bookworm) 
Machine:   Type: Desktop System: LENOVO product: 30BBS1PP00 v: ThinkStation P720 serial: <filter> 
           Chassis: type: 3 serial: <filter> 
           Mobo: LENOVO model: 1037 v: SDK0Q40104 WIN 3305669354006 serial: <filter> 
           UEFI-[Legacy]: LENOVO v: S04KT40A date: 06/10/2019 
CPU:       Info: 2x Quad Core model: Intel Xeon Silver 4112 bits: 64 type: MT MCP SMP 
           arch: Skylake family: 6 model-id: 55 (85) stepping: 4 microcode: 2007006 cache: 
           L2: 16.5 MiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 83200 
           Speed: 800 MHz min/max: 800/3000 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 
           5: 800 6: 800 7: 800 8: 800 9: 800 10: 800 11: 800 12: 800 13: 800 14: 800 15: 800 
           16: 800 
           Vulnerabilities: Type: gather_data_sampling mitigation: Microcode 
           Type: itlb_multihit status: KVM: VMX disabled 
           Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable 
           Type: mds mitigation: Clear CPU buffers; SMT vulnerable 
           Type: meltdown mitigation: PTI 
           Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable 
           Type: reg_file_data_sampling status: Not affected 
           Type: retbleed mitigation: IBRS 
           Type: spec_rstack_overflow status: Not affected 
           Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl 
           Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization 
           Type: spectre_v2 mitigation: IBRS; IBPB: conditional; STIBP: conditional; RSB filling; 
           PBRSB-eIBRS: Not affected; BHI: Not affected 
           Type: srbds status: Not affected 
           Type: tsx_async_abort mitigation: Clear CPU buffers; SMT vulnerable 
Graphics:  Device-1: NVIDIA GP104GL [Quadro P4000] driver: nvidia v: 535.216.03 bus-ID: 18:00.0 
           chip-ID: 10de:1bb1 class-ID: 0300 
           Display: x11 server: X.Org 1.21.1.7 compositor: xfwm4 v: 4.20.0 driver: loaded: nvidia 
           display-ID: :0.0 screens: 1 
           Screen-1: 0 s-res: 3840x1200 s-dpi: 96 s-size: 1017x318mm (40.0x12.5") 
           s-diag: 1066mm (42") 
           Monitor-1: DP-4 res: 1920x1200 hz: 60 dpi: 94 size: 518x324mm (20.4x12.8") 
           diag: 611mm (24.1") 
           Monitor-2: DP-6 res: 1920x1200 hz: 60 dpi: 89 size: 546x352mm (21.5x13.9") 
           diag: 650mm (25.6") 
           OpenGL: renderer: Quadro P4000/PCIe/SSE2 v: 4.6.0 NVIDIA 535.216.03 direct render: Yes 
Audio:     Device-1: Intel vendor: Lenovo driver: snd_hda_intel v: kernel bus-ID: 00:1f.3 
           chip-ID: 8086:a1f0 class-ID: 0403 
           Device-2: NVIDIA GP104 High Definition Audio driver: snd_hda_intel v: kernel 
           bus-ID: 18:00.1 chip-ID: 10de:10f0 class-ID: 0403 
           Sound Server-1: ALSA v: k6.11.10-amd64 running: yes 
           Sound Server-2: PulseAudio v: 16.1 running: yes 
Network:   Device-1: Intel Ethernet I219-LM vendor: Lenovo driver: e1000e v: kernel port: 0780 
           bus-ID: 00:1f.6 chip-ID: 8086:15b9 class-ID: 0200 
           IF: eth1 state: down mac: <filter> 
           Device-2: Intel I210 Gigabit Network vendor: Lenovo driver: igb v: kernel port: 2000 
           bus-ID: 04:00.0 chip-ID: 8086:1533 class-ID: 0200 
           IF: eth0 state: up speed: 1000 Mbps duplex: full mac: <filter> 
Drives:    Local Storage: total: 12.8 TiB used: 5.71 TiB (44.6%) 
           SMART Message: Unable to run smartctl. Root privileges required. 
           ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Western Digital model: WD Blue SN570 500GB 
           size: 465.76 GiB block-size: physical: 4096 B logical: 4096 B speed: 31.6 Gb/s lanes: 4 
           type: SSD serial: <filter> rev: 234110WD temp: 24.9 C scheme: MBR 
           ID-2: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 840 PRO Series size: 476.94 GiB 
           block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: SSD serial: <filter> 
           rev: 6B0Q scheme: MBR 
           SMART Message: Unknown smartctl error. Unable to generate data. 
           ID-3: /dev/sdb maj-min: 8:16 vendor: Western Digital model: WD10JPVT-22A1YT0 
           size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 3.0 Gb/s type: HDD 
           rpm: 5400 serial: <filter> rev: 1A01 scheme: GPT 
           SMART Message: Unknown smartctl error. Unable to generate data. 
           ID-4: /dev/sdc maj-min: 8:32 vendor: Seagate model: ST4000NM0033-9ZM170 size: 3.64 TiB 
           block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 7200 
           serial: <filter> rev: EN09 scheme: GPT 
           SMART Message: Unknown smartctl error. Unable to generate data. 
           ID-5: /dev/sdd maj-min: 8:48 vendor: Western Digital model: WD30EZRZ-00WN9B0 
           size: 2.73 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD 
           rpm: 5400 serial: <filter> rev: 0A80 scheme: GPT 
           SMART Message: Unknown smartctl error. Unable to generate data. 
           ID-6: /dev/sde maj-min: 8:64 vendor: Western Digital model: WD10EZEX-08WN4A0 
           size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD 
           rpm: 7200 serial: <filter> rev: 1A02 scheme: GPT 
           SMART Message: Unknown smartctl error. Unable to generate data. 
           ID-7: /dev/sdf maj-min: 8:80 vendor: Seagate model: ST2000LM015-2E8174 size: 1.82 TiB 
           block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 5400 
           serial: <filter> rev: 0001 scheme: GPT 
           SMART Message: Unknown smartctl error. Unable to generate data. 
           ID-8: /dev/sdg maj-min: 8:96 type: USB vendor: Lexar model: USB Flash Drive 
           size: 57.6 GiB block-size: physical: 512 B logical: 512 B type: SSD serial: <filter> 
           rev: 3.00 scheme: MBR 
           SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure? 
           ID-9: /dev/sdh maj-min: 8:112 type: USB vendor: Toshiba model: DT01ACA200 
           size: 1.82 TiB block-size: physical: 512 B logical: 512 B type: HDD rpm: 7200 
           serial: <filter> rev: 0000 scheme: MBR 
           SMART Message: A mandatory SMART command failed. Various possible causes. 
Partition: ID-1: / raw-size: 474.91 GiB size: 466.38 GiB (98.20%) used: 178.6 GiB (38.3%) fs: ext4 
           block-size: 4096 B dev: /dev/sda1 maj-min: 8:1 
           ID-2: /home raw-size: 465.76 GiB size: 457.38 GiB (98.20%) used: 242.68 GiB (53.1%) 
           fs: ext4 block-size: 4096 B dev: /dev/nvme0n1p1 maj-min: 259:1 
Swap:      Alert: No swap data was found. 
Sensors:   System Temperatures: cpu: 35.0 C mobo: N/A gpu: nvidia temp: 42 C 
           Fan Speeds (RPM): N/A gpu: nvidia fan: 46% 
Repos:     Packages: note: see --pkg apt: 3404 lib: 2021 flatpak: 0 
           No active apt repos in: /etc/apt/sources.list 
           Active apt repos in: /etc/apt/sources.list.d/debian-stable-updates.list 
           1: deb http://deb.debian.org/debian/ bookworm-updates non-free contrib main
           Active apt repos in: /etc/apt/sources.list.d/debian.list 
           1: deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
           2: deb http://security.debian.org/debian-security/ bookworm-security non-free-firmware non-free contrib main
           3: deb-src http://deb.debian.org/debian/ bookworm non-free contrib main
           4: deb http://deb.debian.org/debian/ bookworm-backports non-free contrib main non-free-firmware
           Active apt repos in: /etc/apt/sources.list.d/dropbox.list 
           1: deb [arch=i386,amd64 signed-by=/etc/apt/keyrings/dropbox.asc] http://linux.dropbox.com/debian/ bookworm main
           No active apt repos in: /etc/apt/sources.list.d/ethereum-buster.list 
           Active apt repos in: /etc/apt/sources.list.d/mono-official-stable.list 
           1: deb [signed-by=/usr/share/keyrings/mono-official-archive-keyring.gpg] https://download.mono-project.com/repo/debian/ stable-buster main
           Active apt repos in: /etc/apt/sources.list.d/multimedia.list 
           1: deb https://www.deb-multimedia.org/ bookworm non-free main
           2: deb https://www.deb-multimedia.org/ stable non-free main
           3: deb https://www.deb-multimedia.org/ bookworm-backports main
           4: deb https://www.deb-multimedia.org/ stable-backports main
           Active apt repos in: /etc/apt/sources.list.d/mx.list 
           1: deb http://mxrepo.com/mx/repo/ bookworm main non-free
           2: deb http://mxrepo.com/mx/repo/ bookworm ahs
           Active apt repos in: /etc/apt/sources.list.d/oneAPI.list 
           1: deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi/ all main
           No active apt repos in: /etc/apt/sources.list.d/tvheadend-tvheadend.list 
           No active apt repos in: /etc/apt/sources.list.d/various.list 
Info:      Processes: 374 Uptime: 4h 0m wakeups: 1 Memory: 15.25 GiB used: 4.02 GiB (26.4%) 
           Init: SysVinit v: 3.06 runlevel: 5 default: graphical.target tool: systemctl Compilers: 
           gcc: 12.2.0 alt: 11/12 Client: shell wrapper v: 5.2.15-release inxi: 3.3.06 
Boot Mode: BIOS (legacy, CSM, MBR)

Antro
Posts: 27
Joined: Sat Oct 30, 2021 7:31 am

#4 Post by Antro »

Brand-new Liquorix 6.14.3 headers... same error again:

Code: Select all

In file included from nvfs-dma.c:31:
nvfs-dma.h:31:10: fatal error: linux/blk-mq-pci.h: No such file or directory
   31 | #include <linux/blk-mq-pci.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[4]: *** [/usr/src/linux-headers-6.14.2-1-liquorix-amd64/scripts/Makefile.build:207: nvfs-dma.o] Error 1
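
The error boils down to one header missing from the 6.14 header tree. A minimal sketch for checking a given tree directly is below. `has_header` is a hypothetical helper, and it assumes the headers package unpacks an `include/` directory under the tree root, as the paths in the make.log suggest.

```shell
#!/bin/sh
# has_header: hypothetical helper (not part of dkms or apt).
# Usage: has_header <headers-root> <header path relative to include/>
# Prints "present" or "missing".
has_header() {
    if [ -f "$1/include/$2" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Check the header tree named in the error above.
has_header /usr/src/linux-headers-6.14.2-1-liquorix-amd64 linux/blk-mq-pci.h
```

Pointing the first argument at an older header tree that is still installed gives a quick side-by-side comparison.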

Stevo
Developer
Posts: 14415
Joined: Fri Dec 15, 2006 7:07 pm

#5 Post by Stevo »

Edit: Oh, I get it. It's Bookworm's nvidia-fs-dkms package having the problems. Let's see if it can be updated.
Description: NVIDIA file-system - nvidia-fs.ko kernel driver
GPUDirect Storage (GDS) enables a direct data path for direct memory access
(DMA) transfers between GPU memory and storage, which avoids a bounce buffer
through the CPU.
.
This package builds the nvidia-fs.ko kernel driver.
Can you uninstall it in the meantime?
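
For reference, the dkms output in the first post already names the module that failed, through the make.log path it asks you to consult. A small sketch that extracts it is below. `failed_dkms_module` is an invented name, and the pattern assumes the `Consult /var/lib/dkms/<module>/<version>/build/make.log` wording seen in that output.

```shell
#!/bin/sh
# failed_dkms_module: hypothetical helper (not part of dkms).
# Reads dkms/apt output on stdin and prints "<module> <version>" for each
# make.log path that dkms points to.
failed_dkms_module() {
    sed -n 's|^Consult /var/lib/dkms/\([^/]*\)/\([^/]*\)/build/make\.log.*|\1 \2|p'
}

# Demo on the line from the first post.
printf '%s\n' 'Consult /var/lib/dkms/nvidia-fs/2.13/build/make.log for more information.' \
    | failed_dkms_module
```

Here it reports nvidia-fs 2.13 rather than the main nvidia-current driver, which matches the diagnosis above.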

Oh, good grief. The sources are truly gigantic in size. Are you using CUDA?


Hmmm

Code: Select all

$ locate linux/blk-mq-pci.h

/usr/src/linux-headers-6.1.0-32-common/include/linux/blk-mq-pci.h
/usr/src/linux-headers-6.1.0-33-common/include/linux/blk-mq-pci.h
/usr/src/linux-headers-6.13.5-1-liquorix-amd64/include/linux/blk-mq-pci.h
/usr/src/linux-headers-6.6.7-common/include/linux/blk-mq-pci.h
Nope, not in the 6.14 kernel headers... but, confusingly, the build is still succeeding for me:

Code: Select all

Building module:
Cleaning build area...
env NV_VERBOSE=1 make -j16 modules KERNEL_UNAME=6.14.3-1-liquorix-amd64......................
Cleaning build area...

nvidia-current.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.14.3-1-liquorix-amd64/updates/dkms/

nvidia-current-modeset.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.14.3-1-liquorix-amd64/updates/dkms/

nvidia-current-drm.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.14.3-1-liquorix-amd64/updates/dkms/

nvidia-current-uvm.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.14.3-1-liquorix-amd64/updates/dkms/

nvidia-current-peermem.ko.zst:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/6.14.3-1-liquorix-amd64/updates/dkms/
depmod...
Sign command: /lib/modules/6.14.3-1-liquorix-amd64/build/scripts/sign-file
Binary /lib/modules/6.14.3-1-liquorix-amd64/build/scripts/sign-file not found, modules won't be signed

Antro
Posts: 27
Joined: Sat Oct 30, 2021 7:31 am

#6 Post by Antro »

Stevo wrote: Tue Apr 22, 2025 1:33 pm Edit: Oh, I get it. It's Bookworm's nvidia-fs-dkms package having the problems. Let's see if it can be updated.
Description: NVIDIA file-system - nvidia-fs.ko kernel driver
GPUDirect Storage (GDS) enables a direct data path for direct memory access
(DMA) transfers between GPU memory and storage, which avoids a bounce buffer
through the CPU.
.
This package builds the nvidia-fs.ko kernel driver.
Can you uninstall it in the meantime?
Yes I can! Uninstalled and rebooted: no apparent issues. I gave kernel 6.12.21 a chance: without nvidia-fs, the DKMS compilation works like a charm.

Code: Select all

$ uname -a
Linux htpc 6.12.21-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.21-1~mx23ahs (2025-04-08) x86_64 GNU/Linux
The issue is definitely in that part of the code: I guess that without that package, probably all the kernel releases could be installed.
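
The exact commands used are not shown in the thread; one plausible sequence, as a sketch only, is below. The `run` wrapper and the `recover_without_nvidia_fs` name are inventions for this example. It only prints the commands unless `DRY_RUN=0` is set, so nothing is changed by accident.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed, unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical recovery sequence (an assumption, not quoted from the thread).
recover_without_nvidia_fs() {
    run sudo apt-get remove nvidia-fs-dkms   # drop the module that fails to build
    run sudo dpkg --configure -a             # finish configuring the pending header packages
    run sudo dkms status                     # show which modules are built for which kernel
}

recover_without_nvidia_fs
```

Removing the package means giving up GPUDirect Storage for now; the regular nvidia-kernel-dkms driver is a separate package and is not touched by this.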
Stevo wrote: Tue Apr 22, 2025 1:33 pm Oh, good grief. The sources are truly gigantic in size. Are you using CUDA?
Yesss (a bit embarrassed)... ffmpeg with CUDA support is a rocket for both encoding and decoding.

AFAIK, nvidia-fs should speed up and orchestrate I/O transfers from storage to the GPU. It would be nice to have it working. I could run some benchmarks: kernel 6.11 with nvidia-fs vs. kernel 6.12 without nvidia-fs. It could be worth a try, even though we are dealing with an old CUDA 11 release. It would be nice to have CUDA 12 available too, but I understand that this upgrade is quite huge and complex...


However, now I'm a happy kernel 6.12.21 user. Thanks again, Stevo!
