Fix residual leak in downloadDirectory by skipping file downloads after cancellation. #6983
RanVaknin wants to merge 2 commits into
… cancelled to prevent directory recreation
    if (returnFuture.isDone()) {
        return CompletableFutureUtils.failedFuture(
            SdkClientException.create("Download was cancelled before file could be started"));
    }
It seems the race will still be there. Curious why existing request cancellation logic doesn't work. https://github.com/aws/aws-sdk-java-v2/blob/master/services-custom/s3-transfer-manager/src/main/java/software/amazon/awssdk/transfer/s3/internal/AsyncBufferingSubscriber.java#L56-L63
> Curious why existing request cancellation logic doesn't work.
I was also wondering about it. I saw directories being recreated and populated with new files after the cancellation hook fired. The only explanation I could think of is a race between a thread that has seen the cancellation signal and one that hasn't:
- Thread B enters `onNext()` and hits `consumer.apply(item)`
- Thread A sends the cancellation signal: `requestsInFlight.forEach(f -> f.cancel(true))`
- Thread B hits `requestsInFlight.add(currentRequest)`

In this case thread A's cancellation won't stop thread B's request, because thread B didn't have time to register its request future in the `requestsInFlight` set before the cancellation ran.
> It seems the race will still be there

What do you mean?
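The interleaving described in the list above can be replayed deterministically in a single thread. The following is only a minimal sketch: `requestsInFlight` and `currentRequest` mirror the names used in the comment, while the class and method names are hypothetical and this is not the SDK's `AsyncBufferingSubscriber` code.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class CancellationRaceSketch {
    // Stand-in for the subscriber's set of in-flight request futures.
    static final Set<CompletableFuture<Void>> requestsInFlight = ConcurrentHashMap.newKeySet();

    /** Replays the problematic ordering; returns true if the late request was NOT cancelled. */
    static boolean lateRequestEscapesCancellation() {
        requestsInFlight.clear();
        // Thread B: consumer.apply(item) has produced a future that is not registered yet.
        CompletableFuture<Void> currentRequest = new CompletableFuture<>();
        // Thread A: cancels everything registered so far (the set is still empty).
        requestsInFlight.forEach(f -> f.cancel(true));
        // Thread B: registers its future after the cancellation pass has finished.
        requestsInFlight.add(currentRequest);
        return !currentRequest.isCancelled();
    }

    /** Contrast case: a request registered before the cancellation pass is cancelled. */
    static boolean registeredRequestIsCancelled() {
        requestsInFlight.clear();
        CompletableFuture<Void> currentRequest = new CompletableFuture<>();
        requestsInFlight.add(currentRequest);
        requestsInFlight.forEach(f -> f.cancel(true));
        return currentRequest.isCancelled();
    }

    public static void main(String[] args) {
        System.out.println("late request escapes: " + lateRequestEscapesCancellation());
        System.out.println("registered request cancelled: " + registeredRequestIsCancelled());
    }
}
```

In real code the two "threads" run concurrently, so this ordering is only one of several possible outcomes; the sketch fixes the ordering to show why a request that misses the cancellation pass keeps running.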
In #6875, we addressed cancelling in-flight transfers when a directory download is cancelled. That fix reduced the leaked files per orphaned directory from thousands to single digits in a stress test. However, there was a residual leak for downloads that had already been dispatched to `doDownloadSingleFile` (the method that creates the destination directory and initiates the file download) before the cancellation signal propagated.

When cancellation occurs, thread A calls `subscription.cancel()` to stop new items from being delivered. Meanwhile, thread B has already picked up an item from `onNext` and is inside `doDownloadSingleFile`. Since it entered before thread A's cancellation took effect, it creates the destination directory and starts a download into it.

This PR adds an `isDone()` guard at the top of `doDownloadSingleFile` so that any thread that entered the method before cancellation took effect will return early before touching the filesystem.

The two tests:
- Updated `downloadDirectory_cancel_shouldCancelAllFutures`: it now waits for both downloads to start before cancelling. The original test called `cancel()` immediately after `downloadDirectory()`, implicitly assuming all downloads would always start regardless of cancellation state. With the new `isDone()` guard, that assumption no longer always holds, since cancel can now prevent downloads from starting. The test's intent is unchanged (in-flight futures get cancelled); we just make sure both futures are actually in flight before cancelling. (As a sanity check, I ran this as a repeated test 1000 times to make sure it's deterministic.)
- Added `downloadDirectory_cancelledFuture_shouldNotCreateDirectories`: the test cancels the directory download future, then delivers an S3 object through the publisher. It asserts that the destination subdirectory was never created on the filesystem. Without the fix, `doDownloadSingleFile` would call `createParentDirectoriesIfNeeded()` and create the directory despite the cancellation.
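The guard and the new test's assertion can be illustrated with a small standalone sketch. It assumes nothing about the transfer manager beyond what is described above: `downloadSingleFile` is a hypothetical stand-in for `doDownloadSingleFile`, `Files.createDirectories` stands in for `createParentDirectoriesIfNeeded()`, and a plain `IllegalStateException` replaces `SdkClientException` so the example needs only the JDK.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

class DownloadGuardSketch {
    /** Hypothetical stand-in for doDownloadSingleFile; returnFuture is the directory download future. */
    static CompletableFuture<Path> downloadSingleFile(CompletableFuture<Void> returnFuture, Path destination) {
        // Guard: if the directory download is already done (e.g. cancelled), do not touch the filesystem.
        if (returnFuture.isDone()) {
            CompletableFuture<Path> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                new IllegalStateException("Download was cancelled before file could be started"));
            return failed;
        }
        try {
            // Stand-in for createParentDirectoriesIfNeeded().
            Files.createDirectories(destination.getParent());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // A real implementation would start the transfer here; the sketch completes immediately.
        return CompletableFuture.completedFuture(destination);
    }

    /** Returns whether the destination subdirectory exists after one download attempt. */
    static boolean directoryCreated(boolean cancelFirst) {
        try {
            Path root = Files.createTempDirectory("download-guard-sketch");
            Path destination = root.resolve("sub").resolve("file.txt");
            CompletableFuture<Void> returnFuture = new CompletableFuture<>();
            if (cancelFirst) {
                returnFuture.cancel(true);
            }
            downloadSingleFile(returnFuture, destination);
            return Files.exists(destination.getParent());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("created after cancel: " + directoryCreated(true));
        System.out.println("created without cancel: " + directoryCreated(false));
    }
}
```

Note that in this sketch the check and the directory creation are still two separate steps, so a cancellation arriving between them would not be caught; this is presumably the remaining window the reviewer refers to.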