Fix division by zero in clDice loss harmonic mean#8739
Conversation
…c mean The clDice loss computes the harmonic mean of topology precision (tprec) and sensitivity (tsens) as `2 * tprec * tsens / (tprec + tsens)`. When both values are zero -- which occurs with empty inputs, non-overlapping predictions, or when using smooth=0 -- this produces NaN due to 0/0 division. Add a small epsilon (1e-7) to the harmonic mean denominator to prevent division by zero. This has negligible effect on normal computation since tprec and tsens are bounded in [0, 1], but prevents NaN loss values that would otherwise crash training of tubular structure segmentation models. Also add test cases covering zero-input and non-overlapping edge cases with smooth=0 to verify the fix. Signed-off-by: Mr-Neutr0n <64578610+Mr-Neutr0n@users.noreply.github.com>
📝 WalkthroughWalkthroughThe change adds a new small regularizer parameter, smooth_dr (default 1e-7), to soft_dice and to the constructors/state of SoftclDiceLoss and SoftDiceclDiceLoss. All forward computations now include smooth_dr in Dice/clDice denominators (e.g., tprec, tsens, and combined denominators). Two unit tests were added to ensure no NaNs for zero inputs and non-overlapping predictions when smooth=0.0. Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @Mr-Neutr0n thanks for this change, there should be a value to prevent divide-by-zero but I would suggest following the pattern found with the other Dice losses. I would add a |
The previous fix only added a hardcoded 1e-7 to the harmonic mean denominator in cl_dice, but the same 0/0 division bug exists in the individual tprec and tsens computations (and in soft_dice, which is used by SoftDiceclDiceLoss). When smooth=0 and the skeleton is all zeros, each ratio becomes 0/0 = NaN. Following the pattern used by other Dice losses in monai.losses.dice.DiceLoss, add a smooth_dr constructor argument (default 1e-7) to SoftclDiceLoss, SoftDiceclDiceLoss, and the soft_dice helper, and use it in the tprec, tsens, soft_dice, and harmonic mean denominators. This keeps the API consistent with the rest of the losses module and lets users tune the guard if needed. The new test cases for zero-input and non-overlapping inputs (smooth=0) now pass. Signed-off-by: Mr-Neutr0n <64578610+Mr-Neutr0n@users.noreply.github.com>
0e408ee to
746afac
Compare
|
Thanks @ericspod — the
The original commit only patched the harmonic-mean denominator, but the same 0/0 bug existed in the individual Verified locally — all 5 tests in Latest push: |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
monai/losses/cldice.py (1)
95-106:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd one explicit
smooth_drregression test.
tests/losses/test_cldice_loss.pyonly exercises the defaultsmooth_dr, so a wiring regression in these new public parameters would still pass. Please add one case that sets a non-defaultsmooth_drand asserts the forward path stays finite and uses that value.As per coding guidelines, "Ensure new or modified definitions will be covered by existing or new unit tests."
Also applies to: 131-143, 169-182
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/losses/cldice.py` around lines 95 - 106, Add a regression unit test in tests/losses/test_cldice_loss.py that calls the public soft_dice API with an explicit non-default smooth_dr (e.g., 1e-3) and verifies the forward output is finite (no NaN/inf) and that the smooth_dr value actually affects the result by comparing outputs for two different smooth_dr values (they should differ). Do the same for the other public clDice-related forward functions referenced in the diff (the functions around lines 131-143 and 169-182) so each new/modified parameter is exercised and asserted to produce finite outputs and to change when smooth_dr changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@monai/losses/cldice.py`:
- Around line 175-176: Update the docstring for the parameter smooth_dr in
clDice (referencing smooth_dr and the harmonic mean computation using tprec and
tsens) to explicitly state that smooth_dr is added not only to each individual
ratio denominator but also to the harmonic-mean denominator (the final tprec +
tsens + self.smooth_dr term) so the docs match the implementation; locate the
parameter description in the clDice/clDiceLoss docstring and reword it to
mention both the per-ratio denominators and the combined harmonic denominator.
---
Outside diff comments:
In `@monai/losses/cldice.py`:
- Around line 95-106: Add a regression unit test in
tests/losses/test_cldice_loss.py that calls the public soft_dice API with an
explicit non-default smooth_dr (e.g., 1e-3) and verifies the forward output is
finite (no NaN/inf) and that the smooth_dr value actually affects the result by
comparing outputs for two different smooth_dr values (they should differ). Do
the same for the other public clDice-related forward functions referenced in the
diff (the functions around lines 131-143 and 169-182) so each new/modified
parameter is exercised and asserted to produce finite outputs and to change when
smooth_dr changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: da0bf21b-1698-4174-88d1-fd388e9fb0d2
📒 Files selected for processing (1)
monai/losses/cldice.py
| smooth_dr: Small epsilon added to each ratio denominator to prevent | ||
| 0/0 when smooth=0 and inputs are entirely zero. Defaults to 1e-7. |
There was a problem hiding this comment.
Document the harmonic-mean denominator here too.
This docstring says smooth_dr is added to “each ratio denominator”, but Line 194 also applies it to the final tprec + tsens + self.smooth_dr term. The parameter docs should match the implementation.
As per coding guidelines, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@monai/losses/cldice.py` around lines 175 - 176, Update the docstring for the
parameter smooth_dr in clDice (referencing smooth_dr and the harmonic mean
computation using tprec and tsens) to explicitly state that smooth_dr is added
not only to each individual ratio denominator but also to the harmonic-mean
denominator (the final tprec + tsens + self.smooth_dr term) so the docs match
the implementation; locate the parameter description in the clDice/clDiceLoss
docstring and reword it to mention both the per-ratio denominators and the
combined harmonic denominator.
Summary
SoftclDiceLossandSoftDiceclDiceLosswhen computing the harmonic mean of topology precision and sensitivity1e-7) to the denominator(tprec + tsens)to preventNaNwhen both values are zerosmooth=0Details
The clDice loss computes
cl_dice = 1.0 - 2.0 * (tprec * tsens) / (tprec + tsens). When bothtprecandtsensare zero (e.g., empty inputs, non-overlapping predictions/targets, orsmooth=0), this results in0/0 = NaN, which propagates through the loss and crashes training.While the default
smooth=1.0preventstprecandtsensfrom being exactly zero in most cases, settingsmooth=0(a valid configuration) exposes this bug whenever skeleton overlap is zero. The fix adds1e-7to the harmonic mean denominator, which:cl_dice = 1.0(maximum loss) when both precision and sensitivity are zero, which is the correct semantic resultsmooth_drinDiceLoss)Test plan
test_cldice_loss.pytests still pass (perfect overlap cases)test_zero_input_no_nan: verifies zero-valued inputs withsmooth=0do not produce NaNtest_no_overlap_no_nan: verifies non-overlapping predictions/targets withsmooth=0do not produce NaN