Implement kexec reboot support for SSH mode #20
aleksandrov-denis wants to merge 6 commits into rhkdump:main
Force-pushed from 4ed3d67 to 6d12ccf
    log "Strategy: Performing kexec reboot (fast reboot)"
    if ! kexec_load_kernel "$TESTED_KERNEL"; then
        log "Falling back to full reboot"
        kab_reboot
If you want to extend criu-daemon.sh to process a kexec reboot request, I guess CRIU should work as well.
Fedora doesn't install kexec-tools by default. So kexec-tools needs to be installed as a dependency.
        return 1
    fi

    cmdline=$(run_cmd cat /proc/cmdline)
/proc/cmdline contains BOOT_IMAGE=(hd0,gpt3)/vmlinuz-6.17.1-300.fc43.x86_64. I'm not sure how it will affect kexec rebooting. I think kexec --reuse-cmdline should be more robust.
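A minimal sketch of what that suggestion could look like. It assumes the usual Fedora/RHEL naming of /boot/vmlinuz-<release> and /boot/initramfs-<release>.img; build_kexec_load_cmd is a hypothetical helper, not something from this PR. It only prints the command, so it can be inspected without root or kexec-tools installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kexec load command for a kernel release.
# Assumption: kernel and initramfs follow Fedora/RHEL-style names under /boot.
build_kexec_load_cmd() {
    release=$1
    boot_dir=${2:-/boot}
    # --reuse-cmdline tells kexec to take the command line of the running
    # kernel, so the caller does not have to read /proc/cmdline itself.
    printf 'kexec -l %s --initrd=%s --reuse-cmdline\n' \
        "$boot_dir/vmlinuz-$release" \
        "$boot_dir/initramfs-$release.img"
}

build_kexec_load_cmd "6.17.1-300.fc43.x86_64"
```

Printing instead of executing keeps the sketch safe to run anywhere; a real helper would run the command on the test host.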
Thanks for opening this PR! Good to know the rebooting speed has greatly improved. Besides the suggestions in the inline code comments, you could switch one of the integration tests to the kexec reboot strategy so this feature is covered. Btw, you may want to link this PR to #7. @gemini-cli /review
do_kexec_reboot() was a stub that always fell back to a full reboot. Implement it properly for SSH mode, where CRIU is not involved and kexec is safe to use.
Add kexec_load_kernel() to lib.sh which validates that the kernel and initramfs files exist, reads the current command line from /proc/cmdline, and loads the new kernel into memory with kexec -l. Add kab_kexec() which executes the loaded kernel via reboot_and_wait.
In local/CRIU mode, kexec bypasses the normal boot sequence that CRIU relies on to restart the daemon, so the existing fallback to full reboot is preserved.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The dnf install output was being redirected to /var/log/install.log on the controller machine (local shell redirect), which fails when the controller runs as a non-root user. Pass the redirect as an argument to run_cmd so it is evaluated on the test host, consistent with how install_from_git handles its build log.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
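The difference between the two redirect forms can be reproduced with a small simulation. This is only a sketch: the run_cmd below is a stand-in that runs the command in a separate directory to play the role of the test host, and the real run_cmd in this project may behave differently. HOST_DIR, CTRL_DIR, and the log file names are made up for the demo.

```shell
#!/usr/bin/env bash
# Stand-in for the project's run_cmd. Assumption: the real one executes its
# arguments on the test host; here a subshell in HOST_DIR simulates that.
HOST_DIR=$(mktemp -d)
CTRL_DIR=$(mktemp -d)
run_cmd() { ( cd "$HOST_DIR" && bash -c "$*" ); }

cd "$CTRL_DIR" || exit 1

# Redirect parsed by the controller's shell: the log lands on the controller.
run_cmd echo installing > local-redirect.log

# Redirect passed inside the argument: the log lands on the simulated host.
run_cmd "echo installing > remote-redirect.log"

echo "controller: $(ls "$CTRL_DIR")"
echo "host:       $(ls "$HOST_DIR")"
```

With a root-owned path such as /var/log/install.log, the first form needs write access on the controller, which is why it fails for a non-root controller user.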
Two improvements based on review feedback:
Use kexec --reuse-cmdline instead of reading /proc/cmdline manually. This avoids passing bootloader-specific parameters like BOOT_IMAGE= that kexec does not need and which can vary across boot environments.
Extend kexec support to local/CRIU mode. kexec still performs a full Linux boot, so the CRIU daemon restarts via cron and can restore the bisect process normally. kab_kexec() now signals the daemon with a "kexec" checkpoint type, which triggers kexec -e after checkpointing. criu-daemon.sh is updated to recognise and allow kexec -e commands. do_kexec_reboot() no longer special-cases CRIU mode.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
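One way a daemon could map a checkpoint type to the command it runs after checkpointing is a plain case dispatch. The sketch below is illustrative only: post_checkpoint_cmd, the "reboot" type, and the systemctl reboot command are assumptions, and the actual criu-daemon.sh may be structured differently. Only the "kexec" type and kexec -e come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical dispatch: translate a checkpoint type into the command to run
# once the checkpoint has been written. Unknown types are rejected, so only
# allowlisted commands can ever be executed.
post_checkpoint_cmd() {
    case "$1" in
        kexec)  echo "kexec -e" ;;           # jump into the preloaded kernel
        reboot) echo "systemctl reboot" ;;   # assumed full-reboot command
        *)
            echo "unknown checkpoint type: $1" >&2
            return 1
            ;;
    esac
}

post_checkpoint_cmd kexec
```

Keeping the mapping in one function means a new reboot strategy only needs one extra case arm.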
Set REBOOT_STRATEGY="kexec" in the SSH integration test so that the kexec code path gets exercised on every test run.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
kexec-tools was only installed as part of kdump setup, which only runs when TEST_STRATEGY="panic". A user running REBOOT_STRATEGY="kexec" with TEST_STRATEGY="simple" would get a missing kexec binary. Install kexec-tools at the start of setup_kdump() whenever REBOOT_STRATEGY="kexec" is set, independent of the test strategy.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
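The guard described in this commit can be sketched as follows. ensure_kexec_tools and the DRY_RUN switch are hypothetical names introduced for the example, and run_cmd is a local stand-in for the project's helper; the real change lives inside setup_kdump() and may look different.

```shell
#!/usr/bin/env bash
# Stand-in for the project's run_cmd; with DRY_RUN=1 it only prints the command.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: install kexec-tools whenever the kexec reboot strategy
# is selected, regardless of which test strategy is in use.
ensure_kexec_tools() {
    if [ "${REBOOT_STRATEGY:-}" != "kexec" ]; then
        return 0
    fi
    run_cmd dnf install -y kexec-tools
}

DRY_RUN=1
REBOOT_STRATEGY="kexec"
ensure_kexec_tools
```

The check depends only on REBOOT_STRATEGY, so TEST_STRATEGY="simple" no longer leaves the host without a kexec binary.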
Force-pushed from 2e7df94 to 15e3698
All comments should be resolved with the latest changes. Let me know what you think :))
Set REBOOT_STRATEGY="kexec" so the CRIU integration test exercises the kexec+CRIU code path: the kernel is loaded via kexec -l and the CRIU daemon checkpoints kab before executing kexec -e, then restores it after the system comes back up.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
REBOOT_STRATEGY=kexec was documented and configurable but not implemented — do_kexec_reboot() was a
stub that always fell back to a full system reboot. This PR implements it for SSH mode, reducing
per-iteration reboot time from ~60s to ~18s.
Changes
Implement kexec support for SSH mode
Adds two helpers to lib.sh:
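- kexec_load_kernel(): validates that the kernel and initramfs files exist on the test host, reads the current kernel command line from /proc/cmdline, and loads the new kernel into memory with kexec -l
- kab_kexec(): executes the loaded kernel via reboot_and_wait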
Updates do_kexec_reboot() in reboot_handler.sh:
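- loads the kernel with kexec_load_kernel; falls back to a full reboot if loading fails, otherwise executes with kab_kexec
- in local/CRIU mode, keeps the existing fallback to a full reboot, because kexec bypasses the normal boot sequence that the CRIU daemon relies on to restore the bisect process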
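The load-then-execute flow with its fallback can be sketched with stubs so that it runs on any machine. Everything below is a stand-in: log, kab_reboot, kab_kexec, and kexec_load_kernel mimic the project's helpers by name only, BOOT_DIR is a hypothetical override, and the stub loader merely checks that a kernel image exists instead of calling kexec.

```shell
#!/usr/bin/env bash
# Stubs standing in for the project's helpers; they print instead of rebooting.
log()        { echo "[log] $*"; }
kab_reboot() { echo "would run: full reboot"; }
kab_kexec()  { echo "would run: kexec -e"; }

# Stub loader: succeeds only if the kernel image for the release exists.
kexec_load_kernel() { [ -f "${BOOT_DIR:-/boot}/vmlinuz-$1" ]; }

# Sketch of the control flow: try to load, fall back to a full reboot on failure.
do_kexec_reboot_sketch() {
    log "Strategy: Performing kexec reboot (fast reboot)"
    if ! kexec_load_kernel "$1"; then
        log "Falling back to full reboot"
        kab_reboot
    else
        kab_kexec
    fi
}

do_kexec_reboot_sketch "no-such-release"
```

Treating a failed load as a soft error keeps a bisect run moving even when one kernel cannot be loaded via kexec.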
install_from_rpm: redirect dnf output on the test host
Fixes a bug where dnf install output in install_from_rpm was redirected to /var/log/install.log via a
local shell redirect, which fails when the controller runs as a non-root user. The redirect is now
passed as an argument to run_cmd so it executes on the test host — consistent with how install_from_git
handles its build log.
Testing
Tested end-to-end on RHEL 9.8 in SSH mode (INSTALL_STRATEGY=rpm, TEST_STRATEGY=simple,
REBOOT_STRATEGY=kexec) bisecting across three CentOS Stream 9 kernel versions (5.14.0-687/688/689):
Resolves #7