AMD GPU resets randomly

Hooten
Posts: 67
Joined: Sat May 05, 2018 5:52 pm

AMD GPU resets randomly

#1 Post by Hooten »

Hello,

I have had an issue with my PC for months now. After trying various fixes on my own, it's time to ask for your help.
While I'm using the computer, the screen randomly freezes for a few seconds, goes black for a moment, and then I'm dropped back to the login screen. It happens at random, whether I'm gaming or just browsing the web. I've tried multiple things: different kernel versions and Mesa drivers, different browsers, different GPU power profiles, underclocking the GPU, reinstalling MX Linux, switching distros (Manjaro, for example), switching window managers (from i3 to Xfce), and booting with and without systemd. Nothing helps. The annoying thing is that this crash doesn't happen on Windows at all.

Here is my setup:

Code:

System:
  Kernel: 6.11.5-1-liquorix-amd64 [6.11-9~mx23ahs] arch: x86_64 bits: 64 compiler: gcc v: 12.2.0 parameters: audit=0
    intel_pstate=disable BOOT_IMAGE=/vmlinuz-6.11.5-1-liquorix-amd64 root=UUID=<filter> ro quiet
    splash init=/lib/systemd/systemd
  Desktop: i3 v: 4.22 info: i3bar vt: 7 dm: LightDM v: 1.26.0 Distro: MX-23.4_ahs_x64 Libretto
    January 21 2024 base: Debian GNU/Linux 12 (bookworm)
Machine:
  Type: Desktop Mobo: ASRock model: X570 Phantom Gaming 4 serial: <superuser required>
    UEFI: American Megatrends v: P5.60 date: 01/18/2024
CPU:
  Info: model: AMD Ryzen 9 5900X bits: 64 type: MT MCP arch: Zen 3+ gen: 4 level: v3 note: check
    built: 2022 process: TSMC n6 (7nm) family: 0x19 (25) model-id: 0x21 (33) stepping: 2
    microcode: 0xA20120E
  Topology: cpus: 1x cores: 12 tpc: 2 threads: 24 smt: enabled cache: L1: 768 KiB desc: d-12x32
    KiB; i-12x32 KiB L2: 6 MiB desc: 12x512 KiB L3: 64 MiB desc: 2x32 MiB
  Speed (MHz): avg: 2029 high: 4775 min/max: 550/4951 boost: enabled scaling:
    driver: amd-pstate-epp governor: performance cores: 1: 4766 2: 4766 3: 4450 4: 550 5: 550 6: 550
    7: 4766 8: 4486 9: 4765 10: 550 11: 550 12: 3842 13: 550 14: 550 15: 550 16: 550 17: 550
    18: 550 19: 550 20: 550 21: 4775 22: 3848 23: 550 24: 550 bogomips: 177273
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow mitigation: Safe RET
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; IBRS_FW; STIBP: always-on; RSB
    filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: AMD Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] vendor: Sapphire driver: amdgpu
    v: kernel arch: RDNA-1 code: Navi-1x process: TSMC n7 (7nm) built: 2019-20 pcie: gen: 4
    speed: 16 GT/s lanes: 16 ports: active: DP-1 empty: DP-2,DP-3,HDMI-A-1 bus-ID: 0a:00.0
    chip-ID: 1002:731f class-ID: 0300
  Display: x11 server: X.Org v: 1.21.1.7 with: Xwayland v: 22.1.9 driver: X: loaded: amdgpu
    dri: radeonsi gpu: amdgpu display-ID: :0 screens: 1
  Screen-1: 0 s-res: 2560x1440 s-dpi: 96 s-size: 677x381mm (26.65x15.00") s-diag: 777mm (30.58")
  Monitor-1: DP-1 mapped: DisplayPort-0 model: BenQ EX2710Q serial: <filter> built: 2022
    res: 2560x1440 dpi: 109 gamma: 1.2 size: 597x336mm (23.5x13.23") diag: 685mm (27") ratio: 16:9
    modes: max: 2560x1440 min: 720x400
  API: OpenGL v: 4.6 Mesa 24.2.2-1~mx23ahs renderer: AMD Radeon RX 5600 XT (radeonsi navi10 LLVM
    15.0.6 DRM 3.59 6.11.5-1-liquorix-amd64) direct-render: Yes
Audio:
  Device-1: AMD Navi 10 HDMI Audio driver: snd_hda_intel v: kernel bus-ID: 3-6:2 pcie:
    chip-ID: 08bb:2902 gen: 4 speed: 16 GT/s class-ID: 0300 lanes: 16 bus-ID: 0a:00.1
    chip-ID: 1002:ab38 class-ID: 0403
  Device-2: AMD Starship/Matisse HD Audio vendor: ASRock driver: snd_hda_intel v: kernel pcie:
    gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 0c:00.4 chip-ID: 1022:1487 class-ID: 0403
  Device-3: Texas Instruments PCM2902 Audio Codec type: USB
    driver: hid-generic,snd-usb-audio,usbhid
  API: ALSA v: k6.11.5-1-liquorix-amd64 status: kernel-api tools: alsamixer,amixer
  Server-1: JACK v: 1.9.21 status: off tools: jack_control,qjackctl
  Server-2: PipeWire v: 1.0.0 status: active with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel I211 Gigabit Network vendor: ASRock driver: igb v: kernel pcie: gen: 1
    speed: 2.5 GT/s lanes: 1 port: f000 bus-ID: 04:00.0 chip-ID: 8086:1539 class-ID: 0200
  IF: eth0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IF-ID-1: ham0 state: unknown speed: 10000 Mbps duplex: full mac: <filter>
Drives:
  Local Storage: total: 5.22 TiB used: 3.47 TiB (66.5%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Seagate model: BarraCuda Q5 ZP2000CV30001
    size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD
    serial: <filter> rev: SU5SH017 temp: 39.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Kingston model: SHFS37A240G size: 223.57 GiB block-size:
    physical: 512 B logical: 512 B speed: 6.0 Gb/s type: SSD serial: <filter> rev: BBF0 scheme: GPT
  ID-3: /dev/sdb maj-min: 8:16 vendor: Western Digital model: WDS500G2B0A-00SM50 size: 465.76 GiB
    block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: SSD serial: <filter> rev: 20WD
    scheme: MBR
  ID-4: /dev/sdc maj-min: 8:32 vendor: Western Digital model: WD20EFRX-68EUZN0 size: 1.82 TiB
    block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 5400 serial: <filter>
    rev: 0A82 scheme: MBR
  ID-5: /dev/sdd maj-min: 8:48 type: USB vendor: Toshiba model: MQ01ABD100 size: 931.51 GiB
    block-size: physical: 4096 B logical: 512 B type: HDD rpm: 5400 serial: <filter> scheme: MBR
Partition:
  ID-1: / raw-size: 1.82 TiB size: 1.79 TiB (98.37%) used: 893.84 GiB (48.8%) fs: ext4
    dev: /dev/dm-0 maj-min: 253:0 mapped: luks-<filter>
  ID-2: /boot raw-size: 1024 MiB size: 973.4 MiB (95.06%) used: 515 MiB (52.9%) fs: ext4
    dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-3: /boot/efi raw-size: 256 MiB size: 252 MiB (98.46%) used: 274 KiB (0.1%) fs: vfat
    dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
  Kernel: swappiness: 15 (default 60) cache-pressure: 100 (default)
  ID-1: swap-1 type: file size: 6 GiB used: 768 KiB (0.0%) priority: -2 file: /swap/swap
Sensors:
  System Temperatures: cpu: 47.5 C mobo: 40.0 C gpu: amdgpu temp: 77.0 C mem: 74.0 C
  Fan Speeds (RPM): fan-1: 842 fan-2: 1717 fan-3: 1702 fan-4: 740 fan-5: 0 fan-6: 0 fan-7: 0
    gpu: amdgpu fan: 862
Repos:
  Packages: 2973 pm: dpkg pkgs: 2933 libs: 1606 tools: apt,apt-get,aptitude,nala,synaptic pm: rpm
    pkgs: 0 pm: flatpak pkgs: 40
  No active apt repos in: /etc/apt/sources.list
  No active apt repos in: /etc/apt/sources.list.d/amdgpu-proprietary.list
  Active apt repos in: /etc/apt/sources.list.d/amdgpu.list
    1: deb https://repo.radeon.com/amdgpu/6.1.3/ubuntu focal main
  Active apt repos in: /etc/apt/sources.list.d/brave-browser-release.list
    1: deb [arch=amd64 signed-by=/usr/share/keyrings/brave-browser-archive-keyring.gpg] https://brave-browser-apt-release.s3.brave.com/ stable main
  Active apt repos in: /etc/apt/sources.list.d/debian-stable-updates.list
    1: deb http://deb.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/debian.list
    1: deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
    2: deb http://security.debian.org/debian-security bookworm-security main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/haguichi-debian.list
    1: deb [signed-by=/usr/share/keyrings/haguichi-debian.gpg] http://ppa.launchpad.net/ztefn/haguichi-debian/ubuntu bionic main
  Active apt repos in: /etc/apt/sources.list.d/kxstudio-debian-ppas-2.list
    1: deb http://ppa.launchpad.net/kxstudio-debian/libs/ubuntu focal main
    2: deb http://ppa.launchpad.net/kxstudio-debian/plugins/ubuntu focal main
    3: deb http://ppa.launchpad.net/kxstudio-debian/apps/ubuntu focal main
    4: deb http://ppa.launchpad.net/kxstudio-debian/kxstudio/ubuntu focal main
  Active apt repos in: /etc/apt/sources.list.d/kxstudio-debian-ppas.list
    1: deb http://ppa.launchpad.net/kxstudio-debian/libs/ubuntu bionic main
    2: deb http://ppa.launchpad.net/kxstudio-debian/music/ubuntu bionic main
    3: deb http://ppa.launchpad.net/kxstudio-debian/plugins/ubuntu bionic main
    4: deb http://ppa.launchpad.net/kxstudio-debian/apps/ubuntu bionic main
    5: deb http://ppa.launchpad.net/kxstudio-debian/kxstudio/ubuntu bionic main
  Active apt repos in: /etc/apt/sources.list.d/mx.list
    1: deb http://ftp.cc.uoc.gr/mirrors/linux/mx/mx/repo/ bookworm main non-free
    2: deb http://ftp.cc.uoc.gr/mirrors/linux/mx/mx/repo/ bookworm ahs
  Active apt repos in: /etc/apt/sources.list.d/rocm.list
    1: deb [arch=amd64] https://repo.radeon.com/rocm/apt/6.1.3 focal main
  Active apt repos in: /etc/apt/sources.list.d/signal-xenial.list
    1: deb [arch=amd64 signed-by=/usr/share/keyrings/signal-desktop-keyring.gpg] https://updates.signal.org/desktop/apt xenial main
  Active apt repos in: /etc/apt/sources.list.d/spotify.list
    1: deb http://repository.spotify.com stable non-free
  Active apt repos in: /etc/apt/sources.list.d/zerotier.list
    1: deb [signed-by=/usr/share/keyrings/zerotier-debian-package-key.gpg] http://download.zerotier.com/debian/bookworm bookworm main
Info:
  Processes: 495 Uptime: 14m wakeups: 0 Memory: 31.27 GiB used: 15.33 GiB (49.0%) Init: systemd
  v: 252 target: graphical (5) default: graphical tool: systemctl Compilers: gcc: 12.2.0 alt: 12
  Client: shell wrapper v: 5.2.15-release inxi: 3.3.26
Boot Mode: UEFI
And here is the syslog from when the crash occurs:

Code:

2024-11-02T14:09:01.138546+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=3413954, emitted seq=3413956
2024-11-02T14:09:01.138554+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: Process information: process VNGame.exe pid 4791 thread dxvk-submit pid 5669
2024-11-02T14:09:01.138556+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset begin!
2024-11-02T14:09:05.138544+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: failed to suspend display audio
2024-11-02T14:09:05.303539+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: Dumping IP State
2024-11-02T14:09:05.306538+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: Dumping IP State Completed
2024-11-02T14:09:05.306541+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: BACO reset
2024-11-02T14:09:07.424545+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset succeeded, trying to resume
2024-11-02T14:09:07.424554+02:00 HAL kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000200000).
2024-11-02T14:09:07.424555+02:00 HAL kernel: [drm] VRAM is lost due to GPU reset!
2024-11-02T14:09:07.424555+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: PSP is resuming...
2024-11-02T14:09:07.471540+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: reserve 0x900000 from 0x817d000000 for PSP TMR
2024-11-02T14:09:07.515537+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: RAS: optional ras ta ucode is not available
2024-11-02T14:09:07.521539+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: RAP: optional rap ta ucode is not available
2024-11-02T14:09:07.521543+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
2024-11-02T14:09:07.521543+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: SMU is resuming...
2024-11-02T14:09:07.521544+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: use vbios provided pptable
2024-11-02T14:09:07.521544+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: smc_dpm_info table revision(format.content): 4.5
2024-11-02T14:09:07.524538+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: SMU is resumed successfully!
2024-11-02T14:09:07.624536+02:00 HAL kernel: [drm] kiq ring mec 2 pipe 1 q 0
2024-11-02T14:09:07.625538+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
2024-11-02T14:09:07.625541+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
2024-11-02T14:09:07.625542+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
2024-11-02T14:09:07.625545+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
2024-11-02T14:09:07.625550+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
2024-11-02T14:09:07.625551+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
2024-11-02T14:09:07.625551+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
2024-11-02T14:09:07.625552+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
2024-11-02T14:09:07.625552+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
2024-11-02T14:09:07.625556+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
2024-11-02T14:09:07.625556+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
2024-11-02T14:09:07.625557+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
2024-11-02T14:09:07.625557+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring vcn_dec uses VM inv eng 0 on hub 8
2024-11-02T14:09:07.625557+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 1 on hub 8
2024-11-02T14:09:07.625558+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 4 on hub 8
2024-11-02T14:09:07.625558+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
2024-11-02T14:09:07.628536+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: recover vram bo from shadow start
2024-11-02T14:09:07.638539+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: recover vram bo from shadow done
2024-11-02T14:09:07.638543+02:00 HAL kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset(2) succeeded!
2024-11-02T14:09:07.665446+02:00 HAL systemd[2389]: xfce4-notifyd.service: Main process exited, code=exited, status=1/FAILURE
2024-11-02T14:09:07.665593+02:00 HAL systemd[2389]: xfce4-notifyd.service: Failed with result 'exit-code'.
2024-11-02T14:09:07.679902+02:00 HAL lightdm[1921]: Error opening audit socket: Protocol not supported
2024-11-02T14:09:07.792681+02:00 HAL pipewire-pulse[2412]: mod.protocol-pulse: client 0x57cd871b13d0 [VNGame.exe]: ERROR command:-1 (invalid) tag:4294967295 error:25 (Input/output error)
2024-11-02T14:09:07.793621+02:00 HAL pipewire-pulse[2412]: mod.protocol-pulse: client 0x57cd871b13d0 [VNGame.exe]: ERROR command:-1 (invalid) tag:4294967295 error:25 (Input/output error)
2024-11-02T14:09:08.268969+02:00 HAL systemd[1]: Created slice user-108.slice - User Slice of UID 108.
2024-11-02T14:09:08.269685+02:00 HAL systemd[1]: Starting user-runtime-dir@108.service - User Runtime Directory /run/user/108...
2024-11-02T14:09:08.275207+02:00 HAL systemd[1]: Finished user-runtime-dir@108.service - User Runtime Directory /run/user/108.
2024-11-02T14:09:08.276015+02:00 HAL systemd[1]: Starting user@108.service - User Manager for UID 108...
2024-11-02T14:09:08.324516+02:00 HAL systemd-xdg-autostart-generator[13160]: Configuration file /etc/xdg/autostart/xfce-superkey.desktop is marked executable. Please remove executable permission bits. Proceeding anyway.
2024-11-02T14:09:08.328899+02:00 HAL systemd-xdg-autostart-generator[13160]: /etc/xdg/autostart/mx-usb-unmounter.desktop:76: Key Type was defined multiple times, ignoring.
2024-11-02T14:09:08.400244+02:00 HAL systemd[13141]: Queued start job for default target default.target.
2024-11-02T14:09:08.400664+02:00 HAL systemd[13141]: Created slice app.slice - User Application Slice.
2024-11-02T14:09:08.400865+02:00 HAL systemd[13141]: Created slice session.slice - User Core Session Slice.
2024-11-02T14:09:08.400902+02:00 HAL systemd[13141]: Reached target paths.target - Paths.
2024-11-02T14:09:08.400940+02:00 HAL systemd[13141]: Reached target timers.target - Timers.
2024-11-02T14:09:08.401356+02:00 HAL systemd[13141]: Starting dbus.socket - D-Bus User Message Bus Socket...
2024-11-02T14:09:08.401436+02:00 HAL systemd[13141]: Listening on dirmngr.socket - GnuPG network certificate management daemon.
2024-11-02T14:09:08.401518+02:00 HAL systemd[13141]: Listening on gcr-ssh-agent.socket - GCR ssh-agent wrapper.
2024-11-02T14:09:08.401587+02:00 HAL systemd[13141]: Listening on gnome-keyring-daemon.socket - GNOME Keyring daemon.
2024-11-02T14:09:08.401640+02:00 HAL systemd[13141]: Listening on gpg-agent-browser.socket - GnuPG cryptographic agent and passphrase cache (access for web browsers).
2024-11-02T14:09:08.401691+02:00 HAL systemd[13141]: Listening on gpg-agent-extra.socket - GnuPG cryptographic agent and passphrase cache (restricted).
2024-11-02T14:09:08.401737+02:00 HAL systemd[13141]: Listening on gpg-agent-ssh.socket - GnuPG cryptographic agent (ssh-agent emulation).
2024-11-02T14:09:08.401782+02:00 HAL systemd[13141]: Listening on gpg-agent.socket - GnuPG cryptographic agent and passphrase cache.
2024-11-02T14:09:08.401863+02:00 HAL systemd[13141]: Listening on pipewire-pulse.socket - PipeWire PulseAudio.
2024-11-02T14:09:08.401932+02:00 HAL systemd[13141]: Listening on pipewire.socket - PipeWire Multimedia System Sockets.
2024-11-02T14:09:08.410895+02:00 HAL systemd[13141]: Listening on dbus.socket - D-Bus User Message Bus Socket.
2024-11-02T14:09:08.410936+02:00 HAL systemd[13141]: Reached target sockets.target - Sockets.
2024-11-02T14:09:08.410971+02:00 HAL systemd[13141]: Reached target basic.target - Basic System.
2024-11-02T14:09:08.411002+02:00 HAL systemd[1]: Started user@108.service - User Manager for UID 108.
2024-11-02T14:09:08.411454+02:00 HAL systemd[13141]: Started pipewire.service - PipeWire Multimedia Service.
2024-11-02T14:09:08.411489+02:00 HAL systemd[1]: Started session-c2.scope - Session c2 of User lightdm.
2024-11-02T14:09:08.412004+02:00 HAL systemd[13141]: Started filter-chain.service - PipeWire filter chain daemon.
2024-11-02T14:09:08.412632+02:00 HAL systemd[13141]: Started wireplumber.service - Multimedia Service Session Manager.
2024-11-02T14:09:08.413252+02:00 HAL systemd[13141]: Started pipewire-pulse.service - PipeWire PulseAudio.
2024-11-02T14:09:08.413325+02:00 HAL systemd[13141]: Reached target default.target - Main User Target.
2024-11-02T14:09:08.413373+02:00 HAL systemd[13141]: Startup finished in 129ms.
2024-11-02T14:09:08.446680+02:00 HAL systemd[13141]: Starting dbus.service - D-Bus User Message Bus...
2024-11-02T14:09:08.519260+02:00 HAL systemd[13141]: Started dbus.service - D-Bus User Message Bus.
2024-11-02T14:09:08.520649+02:00 HAL pipewire-pulse[13165]: mod.rt: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.520698+02:00 HAL pipewire-pulse[13165]: mod.rt: RTKit does not give us MaxRealtimePriority, using 1
2024-11-02T14:09:08.520814+02:00 HAL pipewire[13162]: mod.rt: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521076+02:00 HAL pipewire[13162]: mod.rt: RTKit does not give us MaxRealtimePriority, using 1
2024-11-02T14:09:08.521101+02:00 HAL pipewire[13162]: mod.rt: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521122+02:00 HAL pipewire[13162]: mod.rt: RTKit does not give us MinNiceLevel, using 0
2024-11-02T14:09:08.521232+02:00 HAL wireplumber[13164]: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521279+02:00 HAL wireplumber[13164]: RTKit does not give us MaxRealtimePriority, using 1
2024-11-02T14:09:08.521303+02:00 HAL pipewire-pulse[13165]: mod.rt: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521323+02:00 HAL pipewire-pulse[13165]: mod.rt: RTKit does not give us MinNiceLevel, using 0
2024-11-02T14:09:08.521343+02:00 HAL pipewire[13162]: mod.rt: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521364+02:00 HAL pipewire[13162]: mod.rt: RTKit does not give us RTTimeUSecMax, using -1
2024-11-02T14:09:08.521394+02:00 HAL wireplumber[13164]: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521424+02:00 HAL wireplumber[13164]: RTKit does not give us MinNiceLevel, using 0
2024-11-02T14:09:08.521454+02:00 HAL wireplumber[13164]: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521480+02:00 HAL wireplumber[13164]: RTKit does not give us RTTimeUSecMax, using -1
2024-11-02T14:09:08.521504+02:00 HAL pipewire-pulse[13165]: mod.rt: RTKit error: org.freedesktop.DBus.Error.ServiceUnknown
2024-11-02T14:09:08.521523+02:00 HAL pipewire-pulse[13165]: mod.rt: RTKit does not give us RTTimeUSecMax, using -1
2024-11-02T14:09:08.524192+02:00 HAL dbus-daemon[13168]: [session uid=108 pid=13168] Activating service name='org.jackaudio.service' requested by ':1.6' (uid=108 pid=13162 comm="/usr/bin/pipewire")
2024-11-02T14:09:08.531652+02:00 HAL dbus-daemon[13168]: [session uid=108 pid=13168] Successfully activated service 'org.jackaudio.service'
2024-11-02T14:09:08.583039+02:00 HAL dbus-daemon[13168]: [session uid=108 pid=13168] Activating via systemd: service name='org.a11y.Bus' unit='at-spi-dbus-bus.service' requested by ':1.10' (uid=108 pid=13167 comm="/usr/sbin/lightdm-gtk-greeter")
2024-11-02T14:09:08.584416+02:00 HAL systemd[13141]: Starting at-spi-dbus-bus.service - Accessibility services bus...

Thank you in advance
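
For anyone triaging a similar log, the two lines that matter most are the `ring ... timeout` entry and the `Process information` entry after it, which names the process that was submitting work when the GPU hung (here `VNGame.exe`). Below is a minimal Python sketch that pulls those out of a syslog, assuming the line format shown in the excerpt above. The function name `find_gpu_hangs` is made up for illustration and is not part of any MX tool.

```python
import re

# Patterns modelled on the amdgpu lines in the syslog excerpt above.
TIMEOUT_RE = re.compile(r"amdgpu: ring (?P<ring>\S+) timeout")
PROCESS_RE = re.compile(
    r"amdgpu: Process information: process (?P<proc>\S+) pid (?P<pid>\d+)"
)


def find_gpu_hangs(lines):
    """Return a list of (timestamp, ring, process) tuples, one per ring timeout.

    `process` is taken from the next "Process information" line seen after
    the timeout, or None if no such line appears before the next timeout.
    """
    hangs = []
    for line in lines:
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            # The timestamp is the first whitespace-separated field.
            timestamp = line.split(" ", 1)[0]
            hangs.append([timestamp, timeout.group("ring"), None])
            continue
        proc = PROCESS_RE.search(line)
        if proc and hangs and hangs[-1][2] is None:
            hangs[-1][2] = proc.group("proc")
    return [tuple(h) for h in hangs]


# Hypothetical usage; the log path is an assumption and reading it
# usually requires root or membership in the adm group:
#
#     with open("/var/log/syslog", errors="replace") as fh:
#         for hang in find_gpu_hangs(fh):
#             print(hang)
```

If the same process shows up next to every timeout, the application is the first suspect. If the process varies (a game one time, a browser the next), that points more towards the driver, firmware, or hardware.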

j2mcgreg
Global Moderator
Posts: 7127
Joined: Tue Oct 23, 2007 12:04 pm

Re: AMD GPU resets randomly

#2 Post by j2mcgreg »

@Hooten

Code:

amdgpu temp: 77.0 C 
It's likely spiking much higher under load, which would cause the GPU to go into self-preservation mode and shut down. Add a fan that blows exterior air directly onto the GPU to lower that temperature.
Keep this in mind: the fact that a component was designed for Windows and works in Windows means very little in Linux.
HP 15; ryzen 3 5300U APU; 500 Gb SSD; 8GB ram
HP 17; ryzen 3 3200; 500 GB SSD; 12 GB ram
Idea Center 3; 12 gen i5; 256 GB ssd;

In Linux, newer isn't always better. The best solution is the one that works.

Hooten
Posts: 67
Joined: Sat May 05, 2018 5:52 pm

Re: AMD GPU resets randomly

#3 Post by Hooten »

j2mcgreg wrote: Sun Nov 03, 2024 6:49 am @Hooten

Code:

amdgpu temp: 77.0 C 
It's likely spiking much higher under load, which would cause the GPU to go into self-preservation mode and shut down. Add a fan that blows exterior air directly onto the GPU to lower that temperature.
Keep this in mind: the fact that a component was designed for Windows and works in Windows means very little in Linux.
It also happened while I wasn't gaming, just browsing the web. Also, 77.0 C under load is pretty normal for a GPU, and I mentioned that the issue doesn't happen on Windows to rule out faulty hardware, not to bash Linux. In any case, I installed LACT to set a manual GPU fan speed profile, and I'll monitor the situation. Thanks for the suggestion.
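
One way to test the overheating theory while monitoring is to log the GPU temperatures up to the moment of a reset. The amdgpu driver exposes its sensors through the kernel's hwmon interface as `temp*_input` files holding millidegrees Celsius, with optional `temp*_label` files naming each sensor. The sketch below reads them. It assumes the hwmon directory sits under something like `/sys/class/drm/card0/device/hwmon/hwmon*` (the card index and hwmon number vary between systems), and the function name `read_gpu_temps` is invented for this example.

```python
from pathlib import Path


def read_gpu_temps(hwmon_dir):
    """Return {label: degrees_celsius} for every temp*_input file in hwmon_dir.

    The sensor's temp*_label file supplies the label when it exists;
    otherwise the input file's own name is used.
    """
    temps = {}
    for input_file in sorted(Path(hwmon_dir).glob("temp*_input")):
        label_file = input_file.with_name(
            input_file.name.replace("_input", "_label")
        )
        if label_file.exists():
            label = label_file.read_text().strip()
        else:
            label = input_file.name
        # hwmon reports temperatures in millidegrees Celsius.
        temps[label] = int(input_file.read_text().strip()) / 1000.0
    return temps


# Hypothetical usage; the glob pattern below is an assumption about
# where the amdgpu hwmon directory lives on a given machine:
#
#     for hwmon in Path("/sys/class/drm").glob("card*/device/hwmon/hwmon*"):
#         print(hwmon, read_gpu_temps(hwmon))
```

Calling this every second or so and appending the result to a file would show whether the temperature was anywhere near a critical threshold just before the `GPU reset begin!` line appears in the syslog.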

j2mcgreg
Global Moderator
Posts: 7127
Joined: Tue Oct 23, 2007 12:04 pm

Re: AMD GPU resets randomly

#4 Post by j2mcgreg »

Your QSI shows that you are using a DisplayPort connection. Does it also happen if you switch to HDMI?

Kermit the Frog
Posts: 626
Joined: Mon Jul 08, 2024 8:52 am

Re: AMD GPU resets randomly

#5 Post by Kermit the Frog »

Code:

... [drm] VRAM is lost due to GPU reset!
It seems you're not alone. You can try this:
https://community.frame.work/t/responded-vram-is-lost-due-to-gpu-reset-followed-by-a-crash/48367 wrote:
northivanastan:

... Setting the kernel parameter

Code:

amdgpu.sg_display=0
and the GPU mode to UMA_GAME_OPTIMIZED in BIOS settings also helped.
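
After adding a kernel parameter and rebooting, it is worth confirming that it actually took effect, since the active command line is what counts (it is also shown on the `parameters:` line of Quick System Info). The thread does not say how the parameter was added. On Debian-based systems it typically goes into `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, followed by `sudo update-grub`. Note also that the kernel's amdgpu documentation describes `sg_display` as relevant only on APUs, so it may have no effect on a discrete card such as the RX 5600 XT. A minimal Python sketch for the check, with `kernel_param` being a made-up helper name:

```python
def kernel_param(cmdline, name):
    """Look up `name` in a kernel command line string.

    Returns the value after the first "=" for name=value tokens, an empty
    string for a bare flag such as "quiet", or None if the parameter is
    absent.
    """
    for token in cmdline.split():
        key, _sep, value = token.partition("=")
        if key == name:
            return value
    return None


# Hypothetical usage against the running kernel:
#
#     with open("/proc/cmdline") as fh:
#         value = kernel_param(fh.read(), "amdgpu.sg_display")
#     print("not set" if value is None else f"amdgpu.sg_display={value}")
```

Run against the command line from the original post, this reports the parameter as not set, which is the expected state before trying the suggestion.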

Hooten
Posts: 67
Joined: Sat May 05, 2018 5:52 pm

Re: AMD GPU resets randomly

#6 Post by Hooten »

Thanks everyone for your responses, I really appreciate them. For now I'm following j2mcgreg's suggestion: I've set a manual fan speed profile and am waiting to see whether the crash occurs again. If it does, I'll move on to the next suggestion.

Hooten
Posts: 67
Joined: Sat May 05, 2018 5:52 pm

Re: AMD GPU resets randomly

#7 Post by Hooten »

Here I am again. I tried everything you suggested, but unfortunately nothing worked. The last thing I did was switch to HDMI instead of DisplayPort. For a while I was getting a black screen for a few seconds without the GPU resetting; this recurred every few seconds, but eventually it crashed and I was dropped to the login screen again. So at this point I have no idea what else to troubleshoot.

siamhie
Global Moderator
Posts: 3709
Joined: Fri Aug 20, 2021 5:45 pm

Re: AMD GPU resets randomly

#8 Post by siamhie »

What are your temps at idle?


Here's my RX 6700 XT at idle showing 44°C in LACT. This is the curve I have it set for.

lact.jpg
This is my Fluxbox. There are many others like it, but this one is mine. My Fluxbox is my best friend. It is my life.
I must master it as I must master my life. Without me, my Fluxbox is useless. Without my Fluxbox, I am useless.

Hooten
Posts: 67
Joined: Sat May 05, 2018 5:52 pm

Re: AMD GPU resets randomly

#9 Post by Hooten »

siamhie wrote: Tue Nov 05, 2024 7:42 pm What are your temps at idle?


Here's my RX 6700 XT at idle showing 44°C in LACT. This is the curve I have it set for.


lact.jpg
It's 34°C at idle. I've set a similar curve, but it didn't help.
