
Tuxedo freezes with many gpu errors

Posted: Sun May 18, 2025 9:56 am
by IoannisTsoulos
Hi,
I have installed MX Linux 23.6 on a Tuxedo Pulse 14 Gen 3, and the system freezes with the following messages in syslog:
2025-05-18T16:46:11.871076+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:12.125062+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:12.379058+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:22.616125+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
2025-05-18T16:46:22.872059+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:23.127057+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:23.383057+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:23.639063+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:23.895058+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:24.149034+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:24.403058+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:24.658065+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:24.912059+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:25.165059+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:25.418055+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:25.658048+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
2025-05-18T16:46:35.928548+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* flip_done timed out
2025-05-18T16:46:35.928575+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:79:crtc-0] commit wait timed out
2025-05-18T16:46:46.168513+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* flip_done timed out
2025-05-18T16:46:46.168546+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [PLANE:58:plane-3] commit wait timed out
2025-05-18T
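A log like the one above is easier to read once the repeated lines are counted. The following is only a minimal sketch of that idea, not a diagnostic tool: the function name `summarize_amdgpu_errors` and the regular expression are illustrative assumptions, written to match the `amdgpu ... [drm] *ERROR* ...` line shape shown in this post.

```python
import re
from collections import Counter

# Matches kernel log lines shaped like the ones quoted in this thread:
#   <timestamp> tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* <message>
# The pattern is an assumption based on that excerpt only.
ERROR_RE = re.compile(r"amdgpu [0-9a-f:.]+: \[drm\] \*ERROR\* (?P<msg>.+)$")


def summarize_amdgpu_errors(lines):
    """Count amdgpu [drm] *ERROR* messages, grouped by message text."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line.rstrip())
        if match:
            counts[match.group("msg")] += 1
    return counts


def print_summary(path):
    """Print one line per distinct message, most frequent first."""
    with open(path, errors="replace") as fh:
        for msg, count in summarize_amdgpu_errors(fh).most_common():
            print(f"{count:6d}  {msg}")
```

Calling `print_summary("/var/log/syslog")` would list each distinct error with its count; the path is an assumption and reading it may require root privileges on some systems.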


Also, here is the info for my system:

Code:

System:
  Kernel: 6.14.6-1-liquorix-amd64 [6.14-8~mx23ahs] arch: x86_64 bits: 64 compiler: gcc v: 12.2.0 parameters: audit=0
    intel_pstate=disable amd_pstate=disable BOOT_IMAGE=/boot/vmlinuz-6.14.6-1-liquorix-amd64
    root=UUID=<filter> ro quiet splash init=/lib/systemd/systemd
  Desktop: KDE Plasma v: 5.27.5 tk: Qt v: 5.15.8 wm: kwin_x11 vt: 7 dm: SDDM
    Distro: MX-23.6_KDE_x64 Libretto April 13 2025 base: Debian GNU/Linux 12 (bookworm)
Machine:
  Type: Laptop System: TUXEDO product: TUXEDO Pulse 14 Gen3 v: Version 1.0
    serial: <superuser required> Chassis: type: 10 serial: <superuser required>
  Mobo: NB05 model: R14FA1 v: Version 1.0 serial: <superuser required> UEFI: American Megatrends
    LLC. v: 8.12 date: 02/19/2024
Battery:
  ID-1: BAT0 charge: 25.8 Wh (43.5%) condition: 59.3/61.4 Wh (96.6%) volts: 11.2 min: 11.8
    model: Standard SR Real Battery type: Li-ion serial: <filter> status: discharging
CPU:
  Info: model: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics bits: 64 type: MT MCP arch: Zen 4 gen: 5
    level: v4 note: check built: 2022+ process: TSMC n5 (5nm) family: 0x19 (25) model-id: 0x74 (116)
    stepping: 1 microcode: 0xA704104
  Topology: cpus: 1x cores: 8 tpc: 2 threads: 16 smt: enabled cache: L1: 512 KiB
    desc: d-8x32 KiB; i-8x32 KiB L2: 8 MiB desc: 8x1024 KiB L3: 16 MiB desc: 1x16 MiB
  Speed (MHz): avg: 1619 high: 2200 min/max: 1600/3800 boost: disabled scaling:
    driver: acpi-cpufreq governor: ondemand cores: 1: 1422 2: 1278 3: 1751 4: 2196 5: 1317 6: 1600
    7: 1350 8: 2200 9: 1600 10: 1600 11: 1600 12: 1600 13: 1600 14: 1600 15: 1600 16: 1600
    bogomips: 121372
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: ghostwrite status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: old_microcode status: Not affected
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow mitigation: Safe RET
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on;
    PBRSB-eIBRS: Not affected; BHI: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: AMD Phoenix1 vendor: IP3 Tech Phoenix driver: amdgpu v: kernel arch: RDNA-3
    code: Phoenix process: TSMC n4 (4nm) built: 2022+ pcie: gen: 4 speed: 16 GT/s lanes: 16 ports:
    active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4, DP-5, DP-6, HDMI-A-1, Writeback-1 bus-ID: 03:00.0
    chip-ID: 1002:15bf class-ID: 0300 temp: 35.0 C
  Device-2: Microdia USB 2.0 Camera type: USB driver: uvcvideo bus-ID: 1-4:3 chip-ID: 0c45:636c
    class-ID: 0e02
  Display: x11 server: X.Org v: 1.21.1.7 with: Xwayland v: 22.1.9 compositor: kwin_x11 driver: X:
    loaded: amdgpu unloaded: fbdev,modesetting,radeon,vesa dri: radeonsi gpu: amdgpu display-ID: :0
    screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 96 s-size: 507x317mm (19.96x12.48") s-diag: 598mm (23.54")
  Monitor-1: eDP-1 mapped: eDP model: TL140ADXP24-0 built: 2023 res: 1920x1200 hz: 120 dpi: 163
    gamma: 1.2 size: 300x190mm (11.81x7.48") diag: 355mm (14") ratio: 16:10 modes: max: 2880x1800
    min: 640x480
  API: OpenGL v: 4.6 Mesa 24.2.8-1mx23ahs renderer: AMD Radeon 780M (radeonsi gfx1103_r1 LLVM
    15.0.6 DRM 3.61 6.14.6-1-liquorix-amd64) direct-render: Yes
Audio:
  Device-1: AMD Rembrandt Radeon High Definition Audio driver: snd_hda_intel v: kernel pcie: gen: 4
    speed: 16 GT/s lanes: 16 bus-ID: 03:00.1 chip-ID: 1002:1640 class-ID: 0403
  Device-2: AMD Family 17h/19h/1ah HD Audio vendor: Conexant Systems driver: snd_hda_intel
    v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 03:00.6 chip-ID: 1022:15e3 class-ID: 0403
  API: ALSA v: k6.14.6-1-liquorix-amd64 status: kernel-api tools: alsamixer,amixer
  Server-1: PipeWire v: 1.0.0 status: active with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Wi-Fi 6E AX210/AX1675 2x2 [Typhoon Peak] driver: iwlwifi v: kernel modules: wl
    pcie: gen: 2 speed: 5 GT/s lanes: 1 bus-ID: 02:00.0 chip-ID: 8086:2725 class-ID: 0280
  IF: wlan0 state: up mac: <filter>
Bluetooth:
  Device-1: Intel AX210 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 1-5:4 chip-ID: 8087:0032
    class-ID: e001
  Report: hciconfig ID: hci0 rfk-id: 1 state: up address: <filter>
  Info: acl-mtu: 1021:4 sco-mtu: 96:6 link-policy: rswitch sniff link-mode: peripheral accept
    service-classes: rendering, capturing, object transfer, audio, telephony
Drives:
  Local Storage: total: 931.51 GiB used: 39.96 GiB (4.3%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 990 EVO 1TB size: 931.51 GiB
    block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: 0B2QKXJ7 temp: 33.9 C scheme: GPT
Partition:
  ID-1: / raw-size: 931.26 GiB size: 915.57 GiB (98.31%) used: 39.96 GiB (4.4%) fs: ext4
    dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-2: /boot/efi raw-size: 256 MiB size: 252 MiB (98.46%) used: 274 KiB (0.1%) fs: vfat
    dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
  Kernel: swappiness: 15 (default 60) cache-pressure: 100 (default)
  ID-1: swap-1 type: file size: 5 GiB used: 0 KiB (0.0%) priority: -2 file: /swap/swap
Sensors:
  System Temperatures: cpu: 38.0 C mobo: N/A gpu: amdgpu temp: 35.0 C
  Fan Speeds (RPM): cpu: 0
Repos:
  Packages: pm: dpkg pkgs: 2997 libs: 1546 tools: apt,apt-get,aptitude,nala pm: rpm pkgs: 0
    pm: flatpak pkgs: 0
  No active apt repos in: /etc/apt/sources.list
  Active apt repos in: /etc/apt/sources.list.d/debian-stable-updates.list
    1: deb http://deb.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/debian.list
    1: deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
    2: deb http://security.debian.org/debian-security bookworm-security main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/insync.list
    1: deb [signed-by=/etc/apt/trusted.gpg.d/insynchq.gpg] http://apt.insync.io/debian bookworm non-free contrib
  Active apt repos in: /etc/apt/sources.list.d/mx.list
    1: deb http://ftp.cc.uoc.gr/mirrors/linux/mx/mx/repo/ bookworm main non-free
    2: deb http://ftp.cc.uoc.gr/mirrors/linux/mx/mx/repo/ bookworm ahs
Info:
  Processes: 386 Uptime: 9m wakeups: 9579 Memory: 25.25 GiB used: 3.79 GiB (15.0%) Init: systemd
  v: 252 target: graphical (5) default: graphical tool: systemctl Compilers: gcc: 12.2.0 alt: 12
  Client: shell wrapper v: 5.2.15-release inxi: 3.3.26
Boot Mode: UEFI

Re: Tuxedo freezes with many gpu errors

Posted: Sun May 18, 2025 10:12 am
by j2mcgreg
Tuxedo's own OS is based on Ubuntu, so it might require systemd.
Edited to add: you might also need a Liquorix kernel to better handle the Ryzen-based chipset.

Re: Tuxedo freezes with many gpu errors

Posted: Sun May 18, 2025 11:41 am
by siamhie
IoannisTsoulos wrote: Sun May 18, 2025 9:56 am 2025-05-18T16:46:11.871076+03:00 tux kernel: amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
While searching for this error message, I came across this thread: "[SOLVED] amdgpu framework [drm] dc_dmub_srv_log_diagnostic_data", https://bbs.archlinux.org/viewtopic.php?id=302499


Open MX Boot Options, add this to the end of the kernel parameters field, then reboot and see whether the freezes stop.

Code:

amdgpu.dcdebugmask=0x10
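After rebooting, it can help to confirm that the parameter actually reached the kernel command line. The sketch below is a hedged illustration, not part of the original advice: the helper name `find_kernel_param` is hypothetical, and it splits on whitespace only, so it does not handle quoted values containing spaces. As far as I know, bit 0x10 of this mask corresponds to `DC_DISABLE_PSR` (panel self refresh) in the kernel's amdgpu headers; treat that as an assumption to check against your kernel source.

```python
from pathlib import Path


def find_kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line, or None if absent.

    A bare flag without '=' (for example 'quiet') returns an empty string.
    If the name appears more than once, the last occurrence is returned.
    """
    value = None
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == name:
            value = val
    return value


if __name__ == "__main__":
    # /proc/cmdline holds the command line of the running kernel on Linux.
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        value = find_kernel_param(proc_cmdline.read_text(), "amdgpu.dcdebugmask")
        if value is None:
            print("amdgpu.dcdebugmask is not set on the kernel command line")
        else:
            print(f"amdgpu.dcdebugmask={value}")
```

If the script reports that the parameter is not set, the change in MX Boot Options was probably not saved or applied to the boot entry that was used.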

Re: Tuxedo freezes with many gpu errors

Posted: Sun May 18, 2025 12:19 pm
by IoannisTsoulos
I have already followed their advice.
The laptop seems to be stable now, but I still get the following warnings in dmesg:
[ 2.898835] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn314_dsc_pg_control line:237
[ 2.901264] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn314_dsc_pg_control line:245
[ 2.903693] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn314_dsc_pg_control line:253
[ 2.906120] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn314_dsc_pg_control line:261

Re: Tuxedo freezes with many gpu errors

Posted: Sun May 18, 2025 11:50 pm
by IoannisTsoulos
j2mcgreg wrote: Sun May 18, 2025 10:12 am Tuxedo's own OS is based on Ubuntu, so it might require systemd.
Edited to add: you might also need a Liquorix kernel to better handle the Ryzen-based chipset.
I have already switched to systemd from MX Boot Options, and I have installed the Tuxedo drivers and Tuxedo Control Center. The laptop seems to be stable now, apart from some dmesg warnings.