Backport upstream fix for khungtaskd panic on stalled coredumps #575

Draft
manish1-arista wants to merge 1 commit into sonic-net:202505 from manish1-arista:fix-khungtaskd-panic-on-stalled-coredump-202505

@manish1-arista

On Linux 6.1, coredump_task_exit() parks sibling threads in
TASK_UNINTERRUPTIBLE|TASK_FREEZABLE while one thread of the group writes the core file.
Under sustained memory pressure the dump can take longer
than kernel.hung_task_timeout_secs, at which point khungtaskd flags the
parked siblings and (with hung_task_panic=1) panics the box.

Backport mainline v6.12 commit b8e753128ed0 ("exit: Sleep at TASK_IDLE
when waiting for application core dump") which switches that wait to
TASK_IDLE|TASK_FREEZABLE so the watchdog skips it.

Signed-off-by: manish1 <manish1@arista.com>
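
For reviewers who want to see why the state change matters, here is a minimal userspace sketch of the state filter that khungtaskd is understood to apply before it checks a task. It is not kernel code: the bit values are assumed to match include/linux/sched.h on 6.1, and the helper name `watchdog_considers` is hypothetical, so treat it as an illustration rather than the actual implementation.

```python
# Userspace model only. The bit values below are assumed to mirror
# include/linux/sched.h on Linux 6.1; verify against the tree before relying on them.
TASK_UNINTERRUPTIBLE = 0x0002
TASK_WAKEKILL = 0x0100
TASK_NOLOAD = 0x0400
TASK_FREEZABLE = 0x2000

# TASK_IDLE is an uninterruptible sleep that does not count toward load.
TASK_IDLE = TASK_UNINTERRUPTIBLE | TASK_NOLOAD


def watchdog_considers(state: int) -> bool:
    """Hypothetical helper: True if the hung-task check would look at this state.

    Models the filter as: uninterruptible, not killable, and not idle (NOLOAD).
    """
    return (
        bool(state & TASK_UNINTERRUPTIBLE)
        and not (state & TASK_WAKEKILL)
        and not (state & TASK_NOLOAD)
    )


# State of parked siblings before and after the backport.
before = TASK_UNINTERRUPTIBLE | TASK_FREEZABLE
after = TASK_IDLE | TASK_FREEZABLE

print("before backport, eligible for hung-task check:", watchdog_considers(before))
print("after backport, eligible for hung-task check:", watchdog_considers(after))
```

Under these assumptions the first state is eligible for the hung-task check and the second is not, because TASK_IDLE carries the TASK_NOLOAD bit that the filter excludes. The sleep itself stays uninterruptible and freezable in both cases.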
@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
