Skip to content

HDDS-15651. Test case for DiskBalancer when markContainerForDelete fails#10593

Open
arunsarin85 wants to merge 2 commits into
apache:masterfrom
arunsarin85:HDDS-15651
Open

HDDS-15651. Test case for DiskBalancer when markContainerForDelete fails#10593
arunsarin85 wants to merge 2 commits into
apache:masterfrom
arunsarin85:HDDS-15651

Conversation

@arunsarin85

@arunsarin85 arunsarin85 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Test-only PR for HDDS-15651. Adds two unit tests in TestDiskBalancerTask to document the intended DiskBalancer move/cleanup behavior when markContainerForDelete() fails or when lazy deletion fails.

Please describe your PR in detail:
DiskBalancer treats container move and source cleanup as separate phases. Once import and ContainerSet update succeed, the move is reported as success even if marking the old source replica fails. The old replica is queued in pendingDeletionContainers and removed after replica.deletion.delay.

This PR adds tests to lock in that behavior and document a known gap when lazy deletion fails.

Test 1: moveSucceedsWhenMarkContainerForDeleteFails

  • Simulates markContainerForDelete() failure on the source replica after a successful move.
  • Verifies the move is still reported as success (success metrics updated, no rollback).
  • Verifies ContainerSet points to the destination replica.
  • Verifies the source replica stays on disk temporarily and is queued for lazy deletion.
  • After the delay, verifies the source replica is removed via cleanupPendingDeletionContainers().

Test 2: lazyDeletionFailureDoesNotRetry

  • Runs a successful move and advances the clock past the deletion delay.
  • Mocks KeyValueContainerUtil.removeContainer() to fail during lazy deletion.
  • Verifies the source replica remains on disk, the pending queue entry is dropped, and deletion is not retried on a second cleanup attempt.
  • Documents current behavior when lazy deletion fails (recovery depends on other paths such as DN restart for Ratis).

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15651

How was this patch tested?

mvn test -pl hadoop-hdds/container-service -am
-Dtest=TestDiskBalancerTask#moveSucceedsWhenMarkContainerForDeleteFails,TestDiskBalancerTask#lazyDeletionFailureDoesNotRetry
-DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false

@arunsarin85 arunsarin85 marked this pull request as draft June 24, 2026 15:40

@Gargi-jais11 Gargi-jais11 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @arunsarin85 for raisin the concern. I have left comments below to discuss on this.

readLockReleased = true;
try {
container.markContainerForDelete();
moveSucceeded = true;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the time markContainerForDelete() runs, the expensive part is already done:

  • container copied to destination
  • import completed
  • ContainerSet updated to the new replica
  • destination used space incremented

If mark fails and we roll back, we would:

r- restore ContainerSet to the source replica

  • delete the destination directory
  • revert volume accounting
  • report the move as failed

That means we throw away a valid destination copy and redo the whole move later. For large containers, that is a lot of wasted I/O.

Why the current behavior is acceptable?

The move and cleanup are intentionally separate:

Move phase — copy + import + ContainerSet update
Cleanup phase — mark/delete source replica (with lazy deletion for in-flight reads)
If phase 1 succeeds, the container is effectively moved. Source cleanup is a follow-up step.

Also, even when mark fails:

  • the old replica is still queued in pendingDeletionContainers
  • deleteContainer() → removeContainer() does not require DELETED state
  • for Ratis, HDDS-9322 cleans up duplicates on DN restart
  • So this is mostly an operational/accounting issue, not a data-loss issue for Ratis.

cc: @ChenSammi

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arunsarin85
I don’t think we should fail and fully roll back a completed move just because source cleanup failed. That adds heavy work and can make DiskBalancer less effective. Existing lazy deletion + dn restart already cover most of the cleanup path for Ratis. For EC we issue can be there as the Pr is not merged yet.

Comment on lines +617 to +621
if (diskBalancerDestDir != null) {
try {
FileUtils.deleteDirectory(diskBalancerDestDir.toFile());
} catch (IOException ex) {
LOG.warn("Failed to delete destination replica during rollback for container {}",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the time markContainerForDelete() runs, copy/import/ContainerSet update are already done. Just deleting the new destination container is not a roll back. As ContainerSet is already pointing to new container on destination disk.

@arunsarin85 arunsarin85 marked this pull request as ready for review June 25, 2026 08:10
@arunsarin85

Copy link
Copy Markdown
Contributor Author

Thanks @Gargi-jais11 for the comments !
As per the design and feature flow explained in the jira https://issues.apache.org/jira/browse/HDDS-15651
I have modified this PR to be a test only [added 2 additional tests.] . Will modify the description accordingly.

@adoroszlai adoroszlai changed the title HDDS-15651. Roll back DiskBalancer move when markContainerForDelete fails HDDS-15651. Test case for DiskBalancer when markContainerForDelete fails Jun 25, 2026
@adoroszlai adoroszlai requested a review from Gargi-jais11 June 25, 2026 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants