From MAILER-DAEMON Sun Apr 30 21:42:52 2023
Received: from list by lists.fsf.org with archive (Exim 4.90_1)
	id 1ptIZ2-0003vj-3f
	for mharc-tech-volunteer-meeting@fsf.org; Sun, 30 Apr 2023 21:42:52 -0400
Received: from mail.fsf.org ([2001:470:142::13])
 by lists.fsf.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <rwp@proulx.com>) id 1ptIZ0-0003vX-19
 for tech-volunteer-meeting@fsf.org; Sun, 30 Apr 2023 21:42:50 -0400
Received: from havoc.proulx.com ([198.99.81.74]) by mail.fsf.org with esmtps
 (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256)
 (Exim 4.93) (envelope-from <rwp@proulx.com>) id 1ptIYy-008bNL-Er
 for tech-volunteer-meeting@fsf.org; Sun, 30 Apr 2023 21:42:49 -0400
Received: from joseki.proulx.com (localhost [127.0.0.1])
 by havoc.proulx.com (Postfix) with ESMTP id A4B47650
 for <tech-volunteer-meeting@fsf.org>; Sun, 30 Apr 2023 19:42:42 -0600 (MDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=proulx.com;
 s=dkim2048; t=1682905362;
 bh=4q7QNLKySX7wLgbcnbFnbq5mJXgqrUbwDUm23wqWkKE=;
 h=Date:From:To:Subject:From;
 b=iyiIk97GESSk67F9cVCCjqJA+V4zxHLgDP9+UFtcoKzPHzsRiEa5xpYlblj9dJms0
 JqYnsrrwNRXeEJcgI11Fw5ET1K0i/s2QKogDapVMEqLnMqyD56En4i2AhAyPVqg3UG
 w08l8++R+pG/jkNvGXMtNrlrb02FATV0/z92bhjSSchhki37c7al88r+5ZsUzqgV6r
 mgxmm/k5pVeYeV37C9fmQeu51ALpB1vlNuR06SQiA/jJrsyluSGo9Ktp9fW7nDXpV8
 Jr3EYxwRF4o0djyUs4e5Ky4wefrv+D/p/SmHqIrERIgqQvND/uEhxu30EBCWVKQtNR
 vA6x8WnddveSg==
Received: from madness.proulx.com (madness.proulx.com [192.168.230.122])
 by joseki.proulx.com (Postfix) with ESMTP id 8249996881
 for <tech-volunteer-meeting@fsf.org>; Sun, 30 Apr 2023 19:42:42 -0600 (MDT)
Received: by madness.proulx.com (Postfix, from userid 1000)
 id 71FDD11AA5; Sun, 30 Apr 2023 19:42:42 -0600 (MDT)
Date: Sun, 30 Apr 2023 19:42:42 -0600
From: Bob Proulx <bob@proulx.com>
To: tech-volunteer-meeting@fsf.org
Subject: Rebooting with systemd wedged
Message-ID: <20230430191724N@bob.proulx.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Received-SPF: pass client-ip=198.99.81.74; envelope-from=rwp@proulx.com;
 helo=havoc.proulx.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001,
 T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: tech-volunteer-meeting@fsf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <tech-volunteer-meeting.fsf.org>
List-Unsubscribe: <https://lists.fsf.org/mailman/options/tech-volunteer-meeting>, 
 <mailto:tech-volunteer-meeting-request@fsf.org?subject=unsubscribe>
List-Archive: <https://lists.fsf.org/archive/html/tech-volunteer-meeting>
List-Post: <mailto:tech-volunteer-meeting@fsf.org>
List-Help: <mailto:tech-volunteer-meeting-request@fsf.org?subject=help>
List-Subscribe: <https://lists.fsf.org/mailman/listinfo/tech-volunteer-meeting>, 
 <mailto:tech-volunteer-meeting-request@fsf.org?subject=subscribe>
X-List-Received-Date: Mon, 01 May 2023 01:42:50 -0000

Recently emacsconfmedia0p reported system package upgrade errors.
When I looked into the problem, the root cause seemed to be that
systemd had gotten wedged.

    root@emacsconfmedia0p:~# systemctl status
    Failed to read server status: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)

The system was not responsive.  I suspect it had been accumulating
zombies while systemd was stuck.

    root@emacsconfmedia0p:~# systemctl reboot
    Failed to read server status: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)

And of course shutdown and reboot both call systemctl now.  I could
not reboot it normally, since systemd is in charge of the reboot and
systemd was not responding.  Isn't systemd just such a wonderful
system?

To work around this I used the Linux Magic SysRq system to sync the
system and boot the kernel directly.  I have been looking for ways to
do this since systemd has a habit of getting stuck on the down side of
a reboot.  And if the system does not do the shut down part then it
can't do the boot up part.  The SysRq interface worked swimmingly.

    root@emacsconfmedia0p:~# echo 1 >/proc/sys/kernel/sysrq
    root@emacsconfmedia0p:~# echo s >/proc/sysrq-trigger
    Apr 29 12:39:00 emacsconfmedia0p kernel: sysrq: Emergency Sync
    root@emacsconfmedia0p:~# echo s >/proc/sysrq-trigger
    Apr 29 12:39:57 emacsconfmedia0p kernel: sysrq: Emergency Sync
    root@emacsconfmedia0p:~# echo b >/proc/sysrq-trigger
    [2593154.257181] sysrq: Resetting
    ...

The first command enables the interface, because the system usually
disables most of the features by default.  (I don't know why.)
Writing a 1 to sysrq enables all of the features at once.  Other
values are treated as a bitmask that enables individual features.
But here it is simpler to enable everything.
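
As an illustration of the bitmask, here is a minimal sketch of a
helper that builds the value from feature names.  The function name
sysrq_mask and the feature names are made up for this example.  The
bit values are the ones listed in the kernel's sysrq documentation.
Also, as far as I can tell from that documentation, the sysrq value
only restricts the keyboard combination, and writes by root to
/proc/sysrq-trigger are honored regardless.  So the enable step is
harmless but may not be strictly required.

```shell
#!/bin/sh
# Hypothetical helper: build a kernel.sysrq bitmask from feature names.
# Bit values follow the kernel's sysrq documentation; the names used
# here are illustrative only.
sysrq_mask() {
    mask=0
    for feature in "$@"; do
        case "$feature" in
            loglevel) bit=2 ;;    # control of console logging level
            keyboard) bit=4 ;;    # keyboard control (SAK, unraw)
            dump)     bit=8 ;;    # debugging dumps of processes
            sync)     bit=16 ;;   # sync command
            remount)  bit=32 ;;   # remount read-only
            signal)   bit=64 ;;   # signalling of processes
            reboot)   bit=128 ;;  # reboot/poweroff
            nice)     bit=256 ;;  # nicing of all RT tasks
            *) echo "unknown feature: $feature" >&2; return 1 ;;
        esac
        mask=$((mask | bit))
    done
    echo "$mask"
}

# Print the value that enables only sync and reboot.
sysrq_mask sync reboot
```

To enable only those two features instead of everything, the output
could be written to the control file as root:
sysrq_mask sync reboot >/proc/sys/kernel/sysrq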

Then I triggered the file system sync.  Traditionally one does this
three times before rebooting.  Once is probably enough.  I did two out
of paranoia and a nod to tradition.

Then the last triggers the kernel to reboot immediately.  This avoids
the systemd shutdown entirely.  Worked perfectly.

Then, having shut down, the system booted normally.  This is
definitely going to be my new go-to action when I want to reboot
misbehaving systemd systems from now on.  Maybe I will install it as
a script reboot-sysrq for ease of use.

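    #!/bin/sh
    # Enable all SysRq functions.
    echo 1 >/proc/sys/kernel/sysrq
    # Emergency sync.  It runs asynchronously, so pause briefly.
    echo s >/proc/sysrq-trigger
    sleep 5
    # Reboot immediately, bypassing the systemd shutdown.
    echo b >/proc/sysrq-trigger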
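
Since a mistake in a script like this reboots the machine, a dry-run
mode may be worth having.  The following is only a sketch under my
own assumptions: the SYSRQ_DRY_RUN variable and the function names
sysrq_send and reboot_sysrq are invented for the example, and the
five second pause after the sync is a guess, not a measured value.

```shell
#!/bin/sh
# Sketch of reboot-sysrq with a dry-run mode.  SYSRQ_DRY_RUN and the
# function names are illustrative assumptions.

# Send one SysRq command letter, or only print it in dry-run mode.
sysrq_send() {
    if [ -n "${SYSRQ_DRY_RUN:-}" ]; then
        echo "would write '$1' to /proc/sysrq-trigger"
    else
        echo "$1" >/proc/sysrq-trigger
    fi
}

# Sync the file systems, then have the kernel reboot immediately.
reboot_sysrq() {
    sysrq_send s || return 1
    # The emergency sync is asynchronous, so give it a moment.
    # Skipped in dry-run mode, where nothing was written.
    [ -n "${SYSRQ_DRY_RUN:-}" ] || sleep 5
    sysrq_send b
}

# Usage (left commented out so that loading this file is harmless):
#   SYSRQ_DRY_RUN=1 reboot_sysrq    # print the steps only
#   reboot_sysrq                    # really reboot, as root
```

With SYSRQ_DRY_RUN set, the function only prints the two steps it
would take, which makes it possible to check the script on a healthy
machine without rebooting it.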

Bob


