which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 2:46 pm
by Danathar
I've been searching around, and I can't seem to get a clean answer so maybe somebody with more knowledge could give some advice?

I've experienced 2 total system freezes, hard enough that the caps lock becomes unresponsive. I've tried the 6.5 kernel in the repos just in case it was something with the current kernel I was running. I'm on MX 23.1 with KDE and Kernel 6.1.

After doing some searching it seems like I should maybe try some magic SysRq keys to gather more data? I've checked the logs after reboot (kern.log, syslog, etc.) and there is simply nothing there. It's a clean break EXCEPT for a bunch of ^@ characters at the part of the syslog where the freeze occurred. Is this null or something?
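
On the ^@ question: ^@ is how pagers and editors such as less and vim display a NUL byte (0x00). A run of them right where the log stops typically means the file had already grown on disk but the data had not been written out when the machine died. What follows is only a minimal Python sketch for locating such runs; the function name find_nul_runs and the 16-byte threshold are made up for illustration.

```python
def find_nul_runs(data: bytes, min_len: int = 16) -> list[tuple[int, int]]:
    """Return (offset, length) for each run of at least min_len NUL bytes."""
    runs = []
    start = None  # offset where the current run of NULs began, if any
    for pos, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = pos
        elif start is not None:
            if pos - start >= min_len:
                runs.append((start, pos - start))
            start = None
    # A run may extend to the very end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs
```

It could be fed the raw bytes of the log, for example find_nul_runs(open("/var/log/syslog", "rb").read()), which normally needs root or membership in the adm group.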

There is a WHOLE BUNCH of magic SysRq keys listed here https://www.kernel.org/doc/html/latest/ ... sysrq.html and I have no idea which ones would be the best to use.

Thoughts?

Greatly appreciate any help. I'd hate to have to toss this laptop.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 3:00 pm
by Charlie Brown
R E I S U B

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 3:08 pm
by Danathar
Charlie Brown wrote: Mon Nov 06, 2023 3:00 pm R E I S U B
That might keep me from doing a hard button reset, but will that help me tell why it occurred? Thank you, BTW. That is helpful!

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 3:21 pm
by CharlesV
Shutting down as gracefully as possible will help the logs to be written.

Look in the /var/log folder at the various logs.

Video problems are usually logged in /var/log/Xorg.0.log.

Problems detected by the kernel are logged in /var/log/kern.log or /var/log/messages

Rotated copies such as kern.log.1 (or Xorg.0.log.old for X) are the previous logs.

Also, the QSI (Quick System Info) logs are a good place to start.

And usually, lockups are caused by video, audio or RAM (not always). Turning off hardware acceleration in your web browser is a great starting point too.

If you post your QSI, that might also give some indicators. And what applications are running when you encounter the lockup?

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 3:26 pm
by CharlesV
Some additional information about commands that can help "gracefully" reboot and write out the log files.

https://linuxhandbook.com/frozen-linux-system/

Code:

Keys 	Description
Alt + SysRq + r 	Takes the keyboard out of raw mode, taking control away from X
Alt + SysRq + e 	Sends SIGTERM to all processes (except init), giving them a chance to quit gracefully
Alt + SysRq + i 	Sends SIGKILL to all processes (except init)
Alt + SysRq + k 	Kills all processes on the current virtual console
Alt + SysRq + s 	Syncs all mounted filesystems, flushing all data to disk
Alt + SysRq + u 	Remounts all filesystems read-only
Alt + SysRq + b 	Reboots the system instantly, without syncing or unmounting
Alt + SysRq + o 	Shuts down the system
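
The keys in this table are the recovery set; none of them records why the machine hung. The kernel's SysRq documentation also lists dump keys, for example l (backtrace of all active CPUs), t (task list), w (blocked tasks) and m (memory info), and those are only honoured when the value in /proc/sys/kernel/sysrq permits debugging dumps. Below is a rough Python sketch that decodes that value, with the bit meanings taken from the kernel documentation; the names SYSRQ_BITS, decode_sysrq_mask and dumps_enabled are illustrative and not part of any tool.

```python
# Bit meanings as listed in the kernel's admin-guide SysRq documentation.
SYSRQ_BITS = {
    0x2: "control of console logging level",
    0x4: "control of keyboard (SAK, unraw)",
    0x8: "debugging dumps of processes etc.",
    0x10: "sync command",
    0x20: "remount read-only",
    0x40: "signalling of processes (term, kill, oom-kill)",
    0x80: "reboot/poweroff",
    0x100: "nicing of all RT tasks",
}


def decode_sysrq_mask(value: int) -> list[str]:
    """Return the SysRq function groups enabled by a /proc/sys/kernel/sysrq value."""
    if value == 0:
        return []  # SysRq disabled completely
    if value == 1:
        return list(SYSRQ_BITS.values())  # every function enabled
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def dumps_enabled(value: int) -> bool:
    """True if the dump keys (l, t, w, m, ...) are permitted by this value."""
    return value == 1 or bool(value & 0x8)
```

As an example, a value of 438 (0x1b6, which as far as I know is the Debian kernel default, so treat that as an assumption for MX) does not include 0x8, so the dump keys would be ignored until root writes 1 to /proc/sys/kernel/sysrq. The dump output goes to the kernel log, so it can only help if the kernel is still alive enough to react to the keyboard.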

Re: which magic sysreq keys to gather info on a total system freeze  [Solved]

Posted: Mon Nov 06, 2023 3:27 pm
by Danathar
Thanks. Next time that crash comes around I will post it. The only app was Firefox. I had forgotten that in the past I'd encountered (not on MX, but on Ubuntu) a reproducible freeze with Firefox playing a YouTube channel. Turning off hardware acceleration in the browser fixed it. I'll turn it off and see if I continue to get that.

It makes sense that shutting down gracefully can write stuff you wouldn't see after a hard reset.

Thank you!

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 3:28 pm
by CharlesV
You're very welcome, and I hope that resolves it :-)

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 3:29 pm
by Danathar
Thanks! One interesting thing I found out, at least on the Dell I'm troubleshooting: it's only the left Alt that works with PrtScr to do SysRq. The right Alt does not work.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 3:31 pm
by CharlesV
I saw this on a laptop I was working on some time back, but I thought it was due to a special map key or something. Interesting!

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 6:35 pm
by MXRobo

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Mon Nov 06, 2023 9:53 pm
by Charlie Brown
Danathar wrote: Mon Nov 06, 2023 3:08 pm... why it occurred?..
In fact I was going to write the same thing: the main goal should be to prevent that. If it happens randomly, the best thing is to take the RAM modules out and re-seat them. That has solved random / unknown freezing issues many times.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 07, 2023 10:06 am
by Danathar
CharlesV wrote: Mon Nov 06, 2023 3:28 pm Your very welcome and hope that resolves it :-)
I had one more freeze yesterday, but I had forgotten to restart Firefox after disabling hardware acceleration, which I've read you need to do. We'll see today if we crash again now that I've restarted FF.

Wow, if it is the Intel driver again, it really takes it down HARD. None of the SysRq keys worked yesterday when it froze again :(

Hmm.. I wonder if I can get around this by running Firefox with hardware acceleration on the NVIDIA chipset? I've got one of those Dell laptops that have both, so I could set the PRIME variables and launch Firefox on the onboard NVIDIA P1000 instead of the Intel chip. Could be interesting if that fixed it.
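
For reference, the PRIME variables mentioned here are, with NVIDIA's proprietary driver, __NV_PRIME_RENDER_OFFLOAD and __GLX_VENDOR_LIBRARY_NAME. A minimal Python sketch of launching a program that way follows; the helper names are hypothetical, and whether Firefox actually renders on the NVIDIA GPU also depends on the driver version and on which GL backend Firefox uses.

```python
import os
import subprocess
from typing import Optional


def prime_offload_env(base: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Return a copy of the environment with the PRIME render offload variables set."""
    env = dict(os.environ if base is None else base)
    env["__NV_PRIME_RENDER_OFFLOAD"] = "1"
    env["__GLX_VENDOR_LIBRARY_NAME"] = "nvidia"
    return env


def launch_on_nvidia(command: list[str]) -> subprocess.Popen:
    """Start a program, e.g. ["firefox"], with the offload variables applied."""
    return subprocess.Popen(command, env=prime_offload_env())
```

The shell equivalent is to prefix the command with the two variable assignments. Firefox must not already be running, otherwise the new window attaches to the existing process and its environment.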

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 07, 2023 10:08 am
by Danathar
Charlie Brown wrote: Mon Nov 06, 2023 9:53 pm
Danathar wrote: Mon Nov 06, 2023 3:08 pm... why it occurred?..
In fact I was going to write the same thing: The main goal should be to prevent that. If that happens randomly the best is to take the rams out then re-seat. That solved random / unknown freezing issues many times.
I really do think it's the Intel video driver with Firefox hardware acceleration turned on, but I'll probably do a full memtest to test the RAM out just in case. They're new Crucial sticks, so I'd be miffed if it was bad memory. Re-seating had not occurred to me though; I'll definitely do that if the freeze happens again.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 07, 2023 10:45 am
by Charlie Brown
OK, but (I just felt too lazy to add that when typing) re-seating worked every time, even when memtest showed everything was fine. That's why I began skipping the memtest suggestion :)

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 9:34 am
by Danathar
CharlesV wrote: Mon Nov 06, 2023 3:28 pm Your very welcome and hope that resolves it :-)
Well, sadly I'm still getting crashes about once every other day. Just this morning I pressed Alt+tilde in KDE and the whole system froze. The only thing in the syslog (nothing in kern.log or the X11 log) is a bunch of ^@ symbols, displayed in reverse video with a white background. I turned on kdump, but the file system looks like it's getting the rug pulled out from underneath it and never has any time to log anything.

Code:

2023-11-21T09:07:44.258875-05:00 dell-mx polkit-kde-authentication-agent-1[2761]: Another client is already authenticating, please try again later.
2023-11-21T09:07:44.259205-05:00 dell-mx plasmashell[23372]: Error executing command as another user: Not authorized
2023-11-21T09:07:44.259223-05:00 dell-mx plasmashell[23372]: This incident has been reported.
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@[...]2023-11-21T09:09:45.496622-05:00 dell-mx kernel: [    0.000000] Linux version 6.1.0-10-amd64 (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-2 (2023-07-27)
2023-11-21T09:09:45.496665-05:00 dell-mx kernel: [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.1.0-10-amd64 root=UUID=8bfb9e66-c226-4b5d-bf82-fe277e36040b ro quiet splash crashkernel=1G resume=UUID=8bfb9e66-c226-4b5d-bf82-fe277e36040b resume_offset=59836416 crashkernel=384M-:128M init=/lib/systemd/systemd
Do you have any other suggestions? MemTest86 ran for 6 hours and did multiple passes without issues, so I think my memory is OK.
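
One detail in the log above: the kernel command line carries two crashkernel= entries (1G and 384M-:128M). As far as I know only one reservation ends up being used, so it may be worth checking which one before relying on kdump. Also note that kdump only captures a dump when the kernel actually panics; a silent hang does not trigger it on its own. A small Python sketch for spotting duplicate entries, with crashkernel_params being an illustrative name:

```python
def crashkernel_params(cmdline: str) -> list[str]:
    """Return every crashkernel= value found on a kernel command line, in order."""
    prefix = "crashkernel="
    return [tok[len(prefix):] for tok in cmdline.split() if tok.startswith(prefix)]


def has_duplicate_crashkernel(cmdline: str) -> bool:
    """True when more than one crashkernel= entry is present."""
    return len(crashkernel_params(cmdline)) > 1
```

On the running system the string would come from /proc/cmdline, and /sys/kernel/kexec_crash_loaded reads 1 when a crash kernel is actually loaded.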

Maybe I should drop back to the 4xx-series NVIDIA driver. Are there some docs on how to do that?

Thanks

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 10:53 am
by CharlesV
Please post your QSI and let's see what this system consists of.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 11:11 am
by Danathar
CharlesV wrote: Tue Nov 21, 2023 10:53 am Please post your QSI and lets see what this system consists of.
Apologies, I forgot about that. Note that I've uninstalled the nvidia 525 drivers with:

Code:

sudo ddm-mx -p nvidia
to see if the freezing stops, but I'm not hopeful, as the P1000 GPU only kicks in when I tell it to with PRIME; otherwise the Intel graphics are used. Also, I have kdump turned on just in case I could catch a crash, but alas...

Code:

System:
  Kernel: 6.1.0-10-amd64 [6.1.38-2] arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
    parameters: BOOT_IMAGE=/vmlinuz-6.1.0-10-amd64 root=UUID=<filter> ro quiet splash crashkernel=1G
    resume=UUID=<filter> resume_offset=59836416 crashkernel=384M-:128M init=/lib/systemd/systemd
  Desktop: KDE Plasma v: 5.27.5 wm: kwin_x11 vt: 7 dm: SDDM Distro: MX-23.1_KDE_x64 Libretto
    July 31 2023 base: Debian GNU/Linux 12 (bookworm)
Machine:
  Type: Laptop System: Dell product: Precision 5530 v: N/A serial: <superuser required> Chassis:
    type: 10 serial: <superuser required>
  Mobo: Dell model: 0NFGCT v: A00 serial: <superuser required> UEFI: Dell v: 1.32.0
    date: 07/05/2023
Battery:
  ID-1: BAT0 charge: 73.2 Wh (100.0%) condition: 73.2/97.0 Wh (75.4%) volts: 12.5 min: 11.4
    model: LGC-LGC8.33 DELL 5XJ288A type: Li-ion serial: <filter> status: full
CPU:
  Info: model: Intel Core i7-8850H bits: 64 type: MT MCP arch: Coffee Lake gen: core 8 level: v3
    note: check built: 2018 process: Intel 14nm family: 6 model-id: 0x9E (158) stepping: 0xA (10)
    microcode: 0xF4
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache: L1: 384 KiB
    desc: d-6x32 KiB; i-6x32 KiB L2: 1.5 MiB desc: 6x256 KiB L3: 9 MiB desc: 1x9 MiB
  Speed (MHz): avg: 1466 high: 2600 min/max: 800/4300 scaling: driver: intel_pstate
    governor: powersave cores: 1: 900 2: 900 3: 2600 4: 900 5: 900 6: 900 7: 900 8: 900 9: 2600
    10: 2600 11: 900 12: 2600 bogomips: 62399
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
  Type: retbleed mitigation: IBRS
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: IBRS, IBPB: conditional, STIBP: conditional, RSB filling,
    PBRSB-eIBRS: Not affected
  Type: srbds mitigation: Microcode
  Type: tsx_async_abort mitigation: TSX disabled
Graphics:
  Device-1: Intel CoffeeLake-H GT2 [UHD Graphics 630] vendor: Dell driver: i915 v: kernel
    arch: Gen-9.5 process: Intel 14nm built: 2016-20 ports: active: eDP-1 empty: DP-1,DP-2,DP-3
    bus-ID: 00:02.0 chip-ID: 8086:3e9b class-ID: 0300
  Device-2: NVIDIA GP107GLM [Quadro P1000 Mobile] vendor: Dell driver: N/A alternate: nouveau
    non-free: 530.xx+ status: current (as of 2023-03) arch: Pascal code: GP10x process: TSMC 16nm
    built: 2016-21 pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 01:00.0 chip-ID: 10de:1cbb
    class-ID: 0302
  Device-3: Microdia Integrated_Webcam_HD type: USB driver: uvcvideo bus-ID: 1-12:5
    chip-ID: 0c45:671d class-ID: 0e02
  Display: x11 server: X.Org v: 1.21.1.7 with: Xwayland v: 22.1.9 compositor: kwin_x11 driver: X:
    loaded: modesetting unloaded: fbdev,vesa dri: iris gpu: i915 display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.00x11.22") s-diag: 582mm (22.93")
  Monitor-1: eDP-1 model: Sharp 0x149a built: 2018 res: 1920x1080 hz: 60 dpi: 142 gamma: 1.2
    size: 344x194mm (13.54x7.64") diag: 395mm (15.5") ratio: 16:9 modes: 1920x1080
  API: OpenGL v: 4.6 Mesa 23.1.2-1~mx23ahs renderer: Mesa Intel UHD Graphics 630 (CFL GT2)
    direct-render: Yes
Audio:
  Device-1: Intel Cannon Lake PCH cAVS vendor: Dell driver: snd_hda_intel v: kernel
    alternate: snd_soc_skl,snd_sof_pci_intel_cnl bus-ID: 00:1f.3 chip-ID: 8086:a348 class-ID: 0403
  API: ALSA v: k6.1.0-10-amd64 status: kernel-api tools: alsamixer,amixer
  Server-1: PipeWire v: 0.3.65 status: active with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Wireless-AC 9260 driver: iwlwifi v: kernel modules: wl pcie: gen: 2 speed: 5 GT/s
    lanes: 1 bus-ID: 3b:00.0 chip-ID: 8086:2526 class-ID: 0280
  IF: wlan0 state: up mac: <filter>
  IF-ID-1: tailscale0 state: unknown speed: -1 duplex: full mac: N/A
  IF-ID-2: virbr0 state: down mac: <filter>
  IF-ID-3: ztnfahjkne state: unknown speed: 10 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: Intel Wireless-AC 9260 Bluetooth Adapter type: USB driver: btusb v: 0.8 bus-ID: 1-4:3
    chip-ID: 8087:0025 class-ID: e001
  Report: hciconfig ID: hci0 rfk-id: 3 state: up address: <filter> bt-v: 3.0 lmp-v: 5.1
    sub-v: 100 hci-v: 5.1 rev: 100
  Info: acl-mtu: 1021:4 sco-mtu: 96:6 link-policy: rswitch sniff link-mode: peripheral accept
    service-classes: rendering, capturing, object transfer, audio, telephony
Drives:
  Local Storage: total: 931.51 GiB used: 137.89 GiB (14.8%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 1TB size: 931.51 GiB
    block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: 3B4QFXO7 temp: 30.9 C scheme: GPT
Partition:
  ID-1: / raw-size: 930.25 GiB size: 914.57 GiB (98.31%) used: 136.98 GiB (15.0%) fs: ext4
    dev: /dev/dm-0 maj-min: 253:0 mapped: luks-<filter>
  ID-2: /boot raw-size: 1024 MiB size: 973.4 MiB (95.06%) used: 927 MiB (95.2%) fs: ext4
    dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-3: /boot/efi raw-size: 256 MiB size: 252 MiB (98.46%) used: 274 KiB (0.1%) fs: vfat
    dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
  Kernel: swappiness: 15 (default 60) cache-pressure: 100 (default)
  ID-1: swap-1 type: file size: 36.97 GiB used: 0 KiB (0.0%) priority: -2 file: /swap/swap
  ID-2: swap-2 type: zram size: 23.13 GiB used: 1024 KiB (0.0%) priority: 100 dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 50.0 C pch: 44.0 C mobo: 49.0 C
  Fan Speeds (RPM): cpu: 2520 fan-1: 2493
Repos:
  Packages: 2714 pm: dpkg pkgs: 2699 libs: 1488 tools: apt,apt-get,aptitude,nala pm: rpm pkgs: 0
    pm: flatpak pkgs: 15
  No active apt repos in: /etc/apt/sources.list
  Active apt repos in: /etc/apt/sources.list.d/1password.list
    1: deb [arch=amd64 signed-by=/usr/share/keyrings/1password-archive-keyring.gpg] https://downloads.1password.com/linux/debian/amd64 stable main
  Active apt repos in: /etc/apt/sources.list.d/debian-stable-updates.list
    1: deb http://deb.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/debian.list
    1: deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
    2: deb http://security.debian.org/debian-security bookworm-security main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/microsoft-edge.list
    1: deb [arch=amd64] https://packages.microsoft.com/repos/edge/ stable main
  Active apt repos in: /etc/apt/sources.list.d/mx.list
    1: deb http://mirrors.rit.edu/mxlinux/mx-packages/mx/repo/ bookworm main non-free
    2: deb http://mirrors.rit.edu/mxlinux/mx-packages/mx/repo/ bookworm ahs
  Active apt repos in: /etc/apt/sources.list.d/signal-xenial-added-by-mxpi.list
    1: deb [arch=amd64] https://updates.signal.org/desktop/apt xenial main
  Active apt repos in: /etc/apt/sources.list.d/tailscale.list
    1: deb [signed-by=/usr/share/keyrings/tailscale-archive-keyring.gpg] https://pkgs.tailscale.com/stable/debian bookworm main
  Active apt repos in: /etc/apt/sources.list.d/zerotier.list
    1: deb http://download.zerotier.com/debian/bookworm bookworm main
Info:
  Processes: 298 Uptime: 1h 15m wakeups: 5659 Memory: 30.85 GiB used: 5.14 GiB (16.7%)
  Init: systemd v: 252 target: graphical (5) default: graphical tool: systemctl Compilers:
  gcc: 12.2.0 alt: 12 Client: shell wrapper v: 5.2.15-release inxi: 3.3.26
Boot Mode: UEFI

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 11:32 am
by CharlesV
Interested in that zram swap :-)

Have you tried running without that, and if so, are you still seeing crashes / lockups?

Your machine has enough RAM, and a simple, small swap file probably won't even be used... so why the zram swap?
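
To see which swap areas are actually active and how much each one is used, /proc/swaps can be read directly. A minimal Python sketch, assuming the usual five-column layout of that file (the function names are illustrative):

```python
def parse_proc_swaps(text: str) -> list[dict]:
    """Parse the contents of /proc/swaps into one dict per swap area."""
    areas = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) != 5:
            continue  # skip anything that does not look like a swap entry
        name, kind, size_kib, used_kib, priority = fields
        areas.append({
            "name": name,
            "type": kind,
            "size_kib": int(size_kib),
            "used_kib": int(used_kib),
            "priority": int(priority),
        })
    return areas


def zram_areas(areas: list[dict]) -> list[dict]:
    """Return only the swap areas backed by a zram device."""
    return [a for a in areas if a["name"].startswith("/dev/zram")]
```

If zram turns out to be in use, running swapoff /dev/zram0 as root would disable it for a test session without touching the swap file.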

And I don't recall if you tried it... but have you tried a Liquorix kernel?

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 11:37 am
by l0dr3
@Danathar
Kernel: 6.1.0-10-amd64 [6.1.38-2]
Just from my experience over the last few months: I found the whole 6.1.xx kernel series very unstable in conjunction with Intel integrated GPUs of all ages, so I upgraded all my Intel-GPU-driven devices to kernels >=6.4.xx (AHS ones and Liquorix ones, both work like a charm :number1: )

greetz l0dr3

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 11:45 am
by Danathar
I ran with both those kernels and still got crashes :( Thanks for the suggestion though.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 11:48 am
by Danathar
It is a good point on the swap. I had it configured on other systems and I guess it's just part of my default setup, but you are right: with the amount of memory on this laptop I probably don't need it. I have tried the Liquorix kernel. It crashed on that as well. Do you really think zram swap could cause that? I've never encountered crashes with that on.

There has GOT to be a way to collect information on this.

WWLTD!!! (What would Linus Torvalds do?!) ;)

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 11:50 am
by Charlie Brown
"Reseat".

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 11:58 am
by Charlie Brown
In the meantime (off-topic) :

/boot ... used: 95.2%

While booted with the kernel you want to keep, open "MX Cleanup" and use the Kernel Removal Tool (at the bottom of the window). It removes unnecessary kernels with 2 clicks and frees up some space.
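
The figure quoted comes from the QSI partition table (927 MiB used of 973.4 MiB). A small Python sketch for checking the same thing after the cleanup, using shutil.disk_usage from the standard library; the function names and the 90% threshold are arbitrary choices for illustration:

```python
import shutil


def percent_used(total: float, used: float) -> float:
    """Used space as a percentage of the total size."""
    return 100.0 * used / total if total else 0.0


def mount_percent_used(path: str = "/boot") -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return percent_used(usage.total, usage.used)


def is_nearly_full(path: str = "/boot", threshold: float = 90.0) -> bool:
    """True when usage is at or above the (arbitrary) threshold."""
    return mount_percent_used(path) >= threshold
```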

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:01 pm
by l0dr3
@Danathar

Have you tried to find an issue with 'DELL Pre-Boot Diagnostics' ??
https://www.dell.com/support/kbdoc/en-u ... sa-and-psa

And 2nd: from your QSI ...
Battery:
ID-1: BAT0 charge: 73.2 Wh (100.0%) condition: 73.2/97.0 Wh (75.4%) volts: 12.5 min: 11.4
model: LGC-LGC8.33 DELL 5XJ288A type: Li-ion serial: <filter> status: full
This indicates that your battery is in a degraded state! I highly recommend that you check whether it's 'swollen' and may therefore be putting mechanical pressure on the motherboard :exclamation:

https://www.dell.com/support/manuals/en ... lenbattery
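
The condition figure in the QSI is simply the current full-charge capacity divided by the design capacity (73.2 / 97.0 Wh, roughly 75%). It measures capacity loss; it cannot by itself show whether the pack is swollen, which needs a visual check. A minimal Python sketch of the same calculation from sysfs, assuming the battery is exposed as BAT0 with energy_full and energy_full_design files (some batteries expose charge_full and charge_full_design instead); the function names are illustrative:

```python
from pathlib import Path


def battery_health_percent(full: float, design: float) -> float:
    """Full-charge capacity as a percentage of design capacity."""
    if design <= 0:
        raise ValueError("design capacity must be positive")
    return 100.0 * full / design


def read_battery_health(name: str = "BAT0") -> float:
    """Read energy_full and energy_full_design from sysfs and compute health."""
    base = Path("/sys/class/power_supply") / name  # assumed battery name
    full = int((base / "energy_full").read_text())
    design = int((base / "energy_full_design").read_text())
    return battery_health_percent(full, design)
```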

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:02 pm
by Danathar
Yeah, I noticed that too and got rid of the unneeded ones. I also fell back to the NVIDIA 470 driver. I know it's bad science to change more than one variable at a time... but maybe

I SO REGRET buying this Dell. I should have done the SANE thing and bought a ThinkPad. For some reason I thought it would be cool to have a decent GPU on the laptop.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:03 pm
by Danathar
Thanks for that. When I bought this used, that was the first thing I checked. It's fine though, and it's on my list of things to fix (that is, if I end up keeping it, which if this keeps up...)

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:05 pm
by Danathar
Charlie Brown wrote: Tue Nov 21, 2023 11:50 am "Reseat".
I ran memtest86 for 6 hours with no errors. Why would re-seating make a difference?

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:06 pm
by CharlesV
+1 @Charlie Brown

The cool thing about zram is that with a fast machine you can store more in swap ... the bad thing is that it uses lots of CPU to compress. And I do not KNOW this for a fact, but I have read that zram compression is BEST when it uses GPU cycles ...

The crash you are talking about is pretty much going to kill any log writing other than 'unexpected...', and when things go bad in RAM or overlaps is when you get bad-news lockups.

As for Linus - I read his 'words of wisdom' were to "keep it simple - one thing for one job". To me, applied to your issue, that means remove as many variables as you can - make it all as simple as possible, with as little running as possible - once you're stable, then add back in apps / processes, one at a time, with plenty of time to see if you're still stable.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:06 pm
by CharlesV
And sorry to say ... I never recommend Dell .. too many things are 'custom or seconds' (in my opinion!)

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:17 pm
by Charlie Brown
Danathar wrote: Tue Nov 21, 2023 12:05 pm
Charlie Brown wrote: Tue Nov 21, 2023 11:50 am "Reseat".
I ran memtest86 for 6 hours with no errors. Why would re-seating make a difference?
Experience from previous threads. The very same thing happened many times. As I said: Mem-Test shows everything fine, that's why I began skipping the mem-test suggestion.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:21 pm
by Danathar
Thanks for that. I'll open 'er up this evening and try to re-seat the DIMM. I did install the extra memory but was pretty sure at the time it was seated correctly.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:25 pm
by Charlie Brown
OK. Just take them out (whether they were loose or not).

OK, I hear you ask: "Say one of them was loose, then why does the test show OK, and why does everything look OK in the system info?" .. I wish I knew. I, too, was surprised when that solved the issue the first time :)

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:49 pm
by CharlesV
Danathar wrote: Tue Nov 21, 2023 12:21 pm
Thanks on that. I'll open er up this evening and try to re-seat the dimm. I did install the extra memory but was pretty sure at the time it was seated correctly.
OH.. what?

So.. one chip is ram you added?

Did you get the EXACT same mfg and ram as the other ram chip?

RIGHT HERE.. is where my issue starts with Dell.. I have seen WAY too many machines where RAM from two different manufacturers is a problem. The RAM doesn't have to be BAD, just have different timing - Dell uses 'custom or seconds' in many cases... and bad RAM timing when handing off a memory page can cause exactly what you're describing here. I have fixed more issues with 'different manufacturer RAM' than I can count - swapping out either to two sticks of the exact original RAM, or in most cases, buying another stick of the second RAM and using two of those.

Have you tested this crash issue without the added ram?

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 12:53 pm
by Danathar
I have not but when I pull it out I’ll double check the specs just to be sure. I’ll also do a quick dmidecode and paste it here before I pull it.

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 1:43 pm
by Danathar
I remember now, I got the exact same model SODIMM. I'll still pull it to see. I'm not using the extra memory at the moment and 16GB is more than enough in most situations.

Code:

$ sudo dmidecode --type 17
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.

Handle 0x003F, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x003E
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 16 GB
        Form Factor: SODIMM
        Set: None
        Locator: DIMM A
        Bank Locator: Not Specified
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2667 MT/s
        Manufacturer: 80CE000080CE
        Serial Number: 40FDEBCC
        Asset Tag: 02184100
        Part Number: M471A2K43CB1-CTD    
        Rank: 2
        Configured Memory Speed: 2667 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V

Handle 0x0040, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x003E
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 16 GB
        Form Factor: SODIMM
        Set: None
        Locator: DIMM B
        Bank Locator: Not Specified
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2667 MT/s
        Manufacturer: 80CE000080CE
        Serial Number: 40FD9FDC
        Asset Tag: 02184100
        Part Number: M471A2K43CB1-CTD    
        Rank: 2
        Configured Memory Speed: 2667 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V
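
Comparing the two records by eye works for two modules; here is a rough Python sketch that does the comparison on saved dmidecode --type 17 output. The function names and the choice of fields to compare are assumptions for illustration.

```python
def parse_memory_devices(text: str) -> list[dict[str, str]]:
    """Split `dmidecode --type 17` output into one dict per Memory Device record."""
    devices = []
    current = None  # the record being filled, or None between records
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            current = {}
            devices.append(current)
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value.strip()
        elif not line:
            current = None  # a blank line ends the record
    return devices


def mismatched_fields(devices, fields=("Manufacturer", "Part Number", "Speed", "Rank")):
    """Return the fields whose values differ between the modules."""
    return [f for f in fields if len({d.get(f) for d in devices}) > 1]
```

For the output above, both records list the same manufacturer ID, part number, speed and rank, so nothing would be reported as mismatched.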

Re: which magic sysreq keys to gather info on a total system freeze

Posted: Tue Nov 21, 2023 2:05 pm
by CharlesV
Cool. Yes, I would try just one stick and see (and reseating too, as Charlie suggests). And so you're running Samsung RAM in both slots - good! :-)