[wasm-reduce] Empty functions with delta debugging#8640

Open
tlively wants to merge 1 commit into main from delta-debugging

Member

@tlively tlively commented Apr 22, 2026

Delta debugging is an algorithm for finding the minimal set of items necessary to preserve a condition. It generally works by splitting the original set of items into increasingly fine partitions and alternating between two kinds of attempts: keeping just one of the partitions, which makes rapid progress when it works, and keeping the complement of one of the partitions, which makes smaller changes that are more likely to work.

Add a header containing a templatized delta debugging implementation, then use it in wasm-reduce to preserve the minimal number of function bodies necessary to reproduce the reduction condition. This should allow wasm-reduce to make much faster progress on emptying out functions in the common case and leave it much less work to do afterwards.

Using delta debugging for deleting functions and performing other reduction operations is left as future work. Deleting functions in particular is challenging because it can involve reloading the module from the working file, potentially changing function names and invalidating the function names that would be stored in the delta debugging partitions.
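
For illustration, the alternation between partitions and complements can be sketched roughly as follows. This is a minimal sketch of the classic scheme under stated assumptions, not the header added in this PR: the name `deltaDebug`, its signature, and the contiguous partitioning are all hypothetical, and the sketch assumes `test(items)` holds for the full input and never tries the empty set.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not the PR's API). Returns a subset of `items` for
// which `test` still returns true and from which no single partition of size
// one can be dropped. Assumes test(items) is true on entry.
template<typename T, typename Test>
std::vector<T> deltaDebug(std::vector<T> items, Test test) {
  std::size_t numPartitions = 2;
  while (items.size() >= 2) {
    numPartitions = std::min(numPartitions, items.size());
    // Split the items into contiguous partitions of nearly equal size.
    std::vector<std::vector<T>> partitions(numPartitions);
    for (std::size_t i = 0; i < items.size(); ++i) {
      partitions[i * numPartitions / items.size()].push_back(items[i]);
    }
    bool reduced = false;
    // First try keeping just one partition: a big step when it works.
    for (const auto& partition : partitions) {
      if (test(partition)) {
        items = partition;
        numPartitions = 2;
        reduced = true;
        break;
      }
    }
    // Then try keeping the complement of one partition: a smaller step.
    // With two partitions the complements are the partitions themselves.
    if (!reduced && numPartitions > 2) {
      for (std::size_t skip = 0; skip < numPartitions; ++skip) {
        std::vector<T> complement;
        for (std::size_t p = 0; p < numPartitions; ++p) {
          if (p != skip) {
            complement.insert(
              complement.end(), partitions[p].begin(), partitions[p].end());
          }
        }
        if (test(complement)) {
          items = std::move(complement);
          numPartitions = std::max<std::size_t>(numPartitions - 1, 2);
          reduced = true;
          break;
        }
      }
    }
    if (!reduced) {
      // Nothing worked at this granularity: refine, or stop once every
      // partition is already a single item.
      if (numPartitions >= items.size()) {
        break;
      }
      numPartitions = std::min(numPartitions * 2, items.size());
    }
  }
  return items;
}
```

In the wasm-reduce setting the items would presumably be function indices and `test` would empty every function body outside the kept set and rerun the reduction condition, but that wiring is not shown here.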

@tlively tlively requested a review from a team as a code owner April 22, 2026 05:42
@tlively tlively requested review from kripken and stevenfontanella and removed request for a team April 22, 2026 05:42
Member Author

tlively commented Apr 22, 2026

Currently validating this approach overnight by reducing a 200MB file with a reduction script that takes over four minutes to crash. Let's see how far it gets by the morning!

[&](Index partitionIndex,
    Index numPartitions,
    const std::vector<Index>& partition) {
  std::cerr << "| try partition " << partitionIndex + 1 << " / "
Member

Why add 1 here?

Member Author

Printing 1-based indices is slightly more intuitive than 0-based indices.

Member

Agree to disagree 😄

Member

@kripken kripken left a comment

Nice! lgtm % you find it is faster

Member Author

tlively commented Apr 22, 2026

In practice I had to do additional hacks to make sure this ran before the usual destructive visitModule, since that starts out by trying to remove individual instructions and does not make fast enough progress. But it works!

I'd like to make two changes here before landing this:

  1. I want to stop delta debugging early when the partition size is less than the square root of the last successful partition size. This will prevent wasting significant time going through tiny partitions when switching to a different reduction strategy (e.g. running passes or destructively removing instructions) might make more progress.
    1. I can do this early exit with exceptions (gross) or by turning the delta debugging implementation into an iterator (also gross). The ideal end state would be to do the latter using coroutines, but until we're ready for that I will probably just use an exception.
  2. I'd like to be more precise and efficient by excluding functions that already have emptied bodies. Right now if there are lots of functions with empty bodies, we waste time by including them in partitions and trying to empty them out again.
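
A rough sketch of both proposed changes, under several assumptions. The chunk-removal loop below is a simplified stand-in for the delta debugging implementation, not the code in this PR, and `StopReduction`, `removeChunks`, `FuncInfo`, and `nonEmptyFunctionIndices` are hypothetical names. It also assumes "last successful partition size" starts out as the total number of items, which the plan above does not specify.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical exception type, used only to unwind out of the search.
struct StopReduction {};

// Throws once the chunk size drops below sqrt(lastSuccessfulSize).
// size < sqrt(last) is checked as size * size < last to avoid floating point.
inline void checkEarlyExit(std::size_t chunk, std::size_t lastSuccessfulSize) {
  if (chunk * chunk < lastSuccessfulSize) {
    throw StopReduction{};
  }
}

// Simplified stand-in for the reducer: repeatedly tries to drop contiguous
// chunks, halving the chunk size after each full sweep.
template<typename T, typename Test>
std::vector<T> removeChunks(std::vector<T> items, Test test) {
  std::size_t lastSuccessfulSize = items.size(); // assumed initial value
  try {
    for (std::size_t chunk = items.size() / 2; chunk >= 1; chunk /= 2) {
      checkEarlyExit(chunk, lastSuccessfulSize);
      for (std::size_t start = 0; start < items.size();) {
        std::size_t end = std::min(start + chunk, items.size());
        // Candidate keeps everything except items[start, end).
        std::vector<T> candidate(items.begin(), items.begin() + start);
        candidate.insert(candidate.end(), items.begin() + end, items.end());
        if (test(candidate)) {
          items = std::move(candidate);
          lastSuccessfulSize = chunk;
        } else {
          start = end;
        }
      }
    }
  } catch (const StopReduction&) {
    // Early exit: keep the best result so far and let the caller switch to
    // a different reduction strategy.
  }
  return items;
}

// Hypothetical stand-in for a function record; not a Binaryen type.
struct FuncInfo {
  std::string name;
  bool bodyIsEmpty;
};

// Collects only the functions that still have something to empty, so that
// already-emptied functions never end up in a partition.
inline std::vector<std::size_t>
nonEmptyFunctionIndices(const std::vector<FuncInfo>& funcs) {
  std::vector<std::size_t> indices;
  for (std::size_t i = 0; i < funcs.size(); ++i) {
    if (!funcs[i].bodyIsEmpty) {
      indices.push_back(i);
    }
  }
  return indices;
}
```

With 16 items of which only 3 and 12 are needed, this sketch stops at `{2, 3, 12, 13}`: the last success used chunks of size 2, so chunks of size 1 fall below the threshold and the remaining cleanup is left to other strategies.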

2 participants