feat(core): add conditional rename option #7815

Open
hfutatzhanghb wants to merge 1 commit into apache:main from hfutatzhanghb:codex/hdfs-rename-overwrite-config

@hfutatzhanghb hfutatzhanghb commented Jun 23, 2026

Member

Which issue does this PR close?

N/A.

Rationale for this change

OpenDAL's default rename contract overwrites an existing destination. Some
callers need an atomic publish operation that moves a source only when the
destination does not exist. A separate stat followed by rename cannot
provide this guarantee because another writer may create the destination
between the two operations.
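
The race can be illustrated on a local filesystem. The sketch below is only an analogy built on Rust's standard library, not OpenDAL or HDFS code, and the function names `publish_racy` and `publish_if_not_exists` are hypothetical. It contrasts a check-then-rename sequence with a single-step publish based on `std::fs::hard_link`, which fails when the destination already exists.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Racy publish: the existence check and the rename are separate steps,
/// so another writer can create `dst` in between and be overwritten.
fn publish_racy(src: &Path, dst: &Path) -> io::Result<bool> {
    if dst.exists() {
        return Ok(false);
    }
    // A concurrent writer may create `dst` at this point.
    fs::rename(src, dst)?; // std::fs::rename replaces an existing file
    Ok(true)
}

/// No-replace publish (local-filesystem analogy, hypothetical helper):
/// `hard_link` fails with `AlreadyExists` if `dst` is present, so the
/// check and the creation of `dst` happen in one step.
fn publish_if_not_exists(src: &Path, dst: &Path) -> io::Result<bool> {
    match fs::hard_link(src, dst) {
        Ok(()) => {
            // The data is now reachable at `dst`; drop the old name.
            fs::remove_file(src)?;
            Ok(true)
        }
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}
```

In the second function a losing writer gets `Ok(false)` and both files are left untouched, which is the guarantee a separate stat cannot give.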

This implementation follows the options-based API accepted in
RFC #7818.

What changes are included in this PR?

  • Add RenameOptions, Operator::rename_with(...).if_not_exists(true), and
    blocking rename_options.
  • Add raw OpRename::if_not_exists and
    Capability::rename_with_if_not_exists.
  • Reject conditional rename with Unsupported when a service does not
    advertise the capability.
  • Enable the capability for HDFS using its native no-overwrite rename
    operation.
  • Map an existing HDFS destination to ConditionNotMatch, including conflicts
    reported by the final native rename, while leaving both paths unchanged.
  • Preserve the existing overwrite behavior of normal HDFS rename.
  • Add correctness checks, behavior tests, HDFS error-mapping coverage, and
    user-facing documentation.
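
To make the intended semantics concrete, here is a minimal in-memory model of the behavior listed above. Every type in it (`MemoryService`, `RenameOptions`, `ErrorKind`) is a simplified stand-in defined for this sketch; the names only mirror the ones in the list and are not OpenDAL's actual definitions.

```rust
use std::collections::HashMap;

// Illustrative stand-ins, not OpenDAL's real types.
#[derive(Debug, PartialEq, Eq)]
enum ErrorKind {
    NotFound,
    Unsupported,
    ConditionNotMatch,
}

#[derive(Default, Clone, Copy)]
struct RenameOptions {
    if_not_exists: bool,
}

struct MemoryService {
    /// Models the advertised capability flag.
    supports_if_not_exists: bool,
    files: HashMap<String, Vec<u8>>,
}

impl MemoryService {
    /// `&mut self` gives exclusive access, so the check and the move below
    /// cannot interleave with another writer in this model.
    fn rename(&mut self, from: &str, to: &str, opts: RenameOptions) -> Result<(), ErrorKind> {
        if opts.if_not_exists {
            // Reject the request when the capability is not advertised.
            if !self.supports_if_not_exists {
                return Err(ErrorKind::Unsupported);
            }
            // An existing destination fails the condition; nothing is changed.
            if self.files.contains_key(to) {
                return Err(ErrorKind::ConditionNotMatch);
            }
        }
        // Default behavior: move the source, overwriting any destination.
        let data = self.files.remove(from).ok_or(ErrorKind::NotFound)?;
        self.files.insert(to.to_string(), data);
        Ok(())
    }
}
```

With `RenameOptions::default()` the model overwrites the destination, matching the unchanged default contract; only the opt-in flag changes the outcome.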

Are there any user-facing changes?

Yes. Rust users can request an atomic no-replace rename with:

op.rename_with("source", "destination")
    .if_not_exists(true)
    .await?;

Normal rename behavior remains unchanged. HDFS initially advertises the new
capability; other services return Unsupported unless they implement and
advertise the same atomic guarantee.
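
A portable caller would therefore need to distinguish at least three outcomes of a conditional rename. The sketch below assumes a local `ErrorKind` enum standing in for OpenDAL's error kinds; `PublishOutcome` and `classify` are hypothetical names introduced only for illustration.

```rust
// Stand-in for the error kinds discussed in this PR; not OpenDAL's enum.
#[derive(Debug, PartialEq, Eq)]
enum ErrorKind {
    Unsupported,
    ConditionNotMatch,
    Unexpected,
}

#[derive(Debug, PartialEq, Eq)]
enum PublishOutcome {
    /// The source was moved to the destination.
    Published,
    /// The destination already existed; both paths are unchanged.
    AlreadyPublished,
    /// The service does not offer the atomic guarantee.
    NeedsAnotherStrategy,
}

/// Maps the result of a conditional rename to a caller-level decision.
fn classify(result: Result<(), ErrorKind>) -> Result<PublishOutcome, ErrorKind> {
    match result {
        Ok(()) => Ok(PublishOutcome::Published),
        Err(ErrorKind::ConditionNotMatch) => Ok(PublishOutcome::AlreadyPublished),
        Err(ErrorKind::Unsupported) => Ok(PublishOutcome::NeedsAnotherStrategy),
        // Anything else is a real failure and is passed through.
        Err(other) => Err(other),
    }
}
```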

AI Usage Statement

OpenAI Codex (GPT-5) was used to prepare and review this PR.

@hfutatzhanghb hfutatzhanghb marked this pull request as ready for review June 23, 2026 03:14
@hfutatzhanghb hfutatzhanghb requested a review from Xuanwo as a code owner June 23, 2026 03:14
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. releases-note/feat The PR implements a new feature or has a title that begins with "feat" labels Jun 23, 2026
@hfutatzhanghb hfutatzhanghb requested a review from tisonkun as a code owner June 23, 2026 03:31
@hfutatzhanghb

Member Author

@Xuanwo @erickguan Hi, could you please review this PR when you have free time? Thanks!

@erickguan erickguan left a comment

Member

No objection to adding a toggle to a service, but I have a comment on your function design.

minor (non-blocking): in the long run, perhaps we could have better names or options for confirming deletion of target objects.

Comment thread core/services/hdfs/src/config.rs Outdated
}

#[allow(deprecated)]
impl Default for HdfsConfig {

Member

minor (non-blocking): we probably don't need Default in any of the services.

Member Author

@erickguan Thanks very much for reviewing. Good advice; I have fixed it and pushed.

Comment thread core/services/hdfs/src/core.rs Outdated
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HdfsRenameTargetAction {

Member

Either we return an enum and raise the error at the call site of hdfs_rename_existing_file_action, or we skip the enum: would returning Ok() be enough?

Member Author

Fixed.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jun 23, 2026
@hfutatzhanghb hfutatzhanghb requested a review from erickguan June 23, 2026 05:46
@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Jun 23, 2026
Comment thread core/services/hdfs/src/config.rs Outdated
pub enable_append: bool,
/// atomic_write_dir of this backend
pub atomic_write_dir: Option<String>,
/// Whether HDFS rename should overwrite an existing target file.

Member

I think this feature is focusing on the wrong thing. The semantics of OpenDAL are clear: rename should overwrite, just like the file system does. Therefore, HDFS should not alter the behavior itself. Instead, it should:

  • Enable the atomic rename if HDFS supports it
  • Fall back to delete-then-rename if HDFS does not support it
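
For illustration, the delete-then-rename fallback from the second bullet might look like the following on a local filesystem. This is a sketch using Rust's standard library with a hypothetical function name, not the HDFS service code.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Overwriting rename emulated in two steps (hypothetical helper).
/// Not atomic: between the delete and the rename, `dst` briefly does not exist.
fn rename_overwrite_fallback(src: &Path, dst: &Path) -> io::Result<()> {
    match fs::remove_file(dst) {
        Ok(()) => {}
        // A missing destination is fine; there is nothing to replace.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    fs::rename(src, dst)
}
```

The gap between the two calls is why a native overwriting rename is preferable whenever the backend offers one.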

@hfutatzhanghb hfutatzhanghb Jun 23, 2026

Member Author

Actually, HDFS supports atomic rename and recommends performing renames through Options.Rename, which is defined here:
https://github.com/apache/hadoop/blob/10bb684bb4d5cb8edbd2ce79a5d695592cef32c5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Options.java#L226

IIUC, should I add a new operation interface atomic_rename_if_not_exists here and use it in lance?

Member

I see. I believe the correct thing we should do here is to add a rename_if_not_exists for opendal.

Member Author

Got it!

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files. labels Jun 23, 2026
@hfutatzhanghb hfutatzhanghb changed the title feat(services/hdfs): add rename overwrite config feat(core): add rename if not exists Jun 23, 2026
@erickguan erickguan self-requested a review June 23, 2026 11:28
@hfutatzhanghb hfutatzhanghb force-pushed the codex/hdfs-rename-overwrite-config branch from 3be5d77 to c518749 Compare June 23, 2026 12:43
@hfutatzhanghb

Member Author

@erickguan @Xuanwo Hi, I have pushed the latest changes. Could you please review them when you have free time? Thanks!

@erickguan

erickguan commented Jun 23, 2026

Member

If we want to extend an API, we would benefit from a small RFC for future reference. Could you build one RFC too?

@hfutatzhanghb

Member Author

> If we want to extend an API, we would benefit from a small RFC for future reference. Could you build one RFC too?

Good advice. I have added an RFC. BTW, I think we could add a rule in AGENTS.md to enforce this. What do you think?

@hfutatzhanghb hfutatzhanghb marked this pull request as draft June 24, 2026 02:46
@hfutatzhanghb hfutatzhanghb marked this pull request as ready for review June 24, 2026 02:46
@hfutatzhanghb

Member Author

@erickguan @Xuanwo All checks passed. please cc. Thanks!

Member

RFC should live in a dedicated PR.


async fn publish(op: Operator) -> Result<()> {
match op
.rename_if_not_exists("staging/file", "published/file")

Member

We shouldn't provide such an API; please check how copy_if_not_exists works.

}
```

The `Access::rename` signature does not change. Services inspect `OpRename` to

Member

We have renamed the Access trait.

The API also exposes a semantic operation that some backends cannot support, so
portable callers must handle `Unsupported` or inspect the capability.

# Rationale and alternatives

Member

Please check other services such as fs and s3 to see how rename works for them.

@hfutatzhanghb hfutatzhanghb changed the title feat(core): add rename if not exists feat(core): add conditional rename option Jun 24, 2026
@hfutatzhanghb hfutatzhanghb force-pushed the codex/hdfs-rename-overwrite-config branch from 1d58950 to 77ed302 Compare June 25, 2026 08:18
@hfutatzhanghb hfutatzhanghb force-pushed the codex/hdfs-rename-overwrite-config branch from b3e66e3 to 0a9a84d Compare June 25, 2026 08:37
@hfutatzhanghb hfutatzhanghb requested a review from Xuanwo June 26, 2026 01:57
@hfutatzhanghb

Member Author

@Xuanwo @erickguan Hi, could you help review this PR when you have free time? Thanks very much!

@erickguan erickguan left a comment

Member

Generally good. Some minor comments.

I don't understand why we need a tuple for options and destination. Can you explain?

/// use opendal_core::options::RenameOptions;
/// use opendal_core::Result;
///
/// fn test(op: blocking::Operator) -> Result<()> {

Member

Suggested change
- /// fn test(op: blocking::Operator) -> Result<()> {
+ /// fn rename_with_options(op: blocking::Operator) -> Result<()> {

/// use opendal_core::Operator;
/// use opendal_core::Result;
///
/// async fn test(op: Operator) -> Result<()> {

Member

Suggested change
- /// async fn test(op: Operator) -> Result<()> {
+ /// async fn rename_with_options(op: Operator) -> Result<()> {

ctx: OperationContext,
srv: Servicer,
from: String,
(opts, to): (options::RenameOptions, String),

Member

Eh, what is so special with options and destination? Do we need a tuple?

ErrorKind::ConditionNotMatch,
"target path already exists while if_not_exists is set",
)
.with_context("input", to_path)

Member

Is input too vague here?


Refer to [`HdfsBuilder`]'s public API docs for more information.

### Rename Behavior

Member

We don't need documentation on this. Users will refer to the operator's documentation.


/// Rename to a nonexistent path should succeed when if_not_exists is set.
pub async fn test_rename_with_if_not_exists(op: Operator) -> Result<()> {
let parent = format!("{}/", uuid::Uuid::new_v4());

Member

No need for parent since we have uuid here.


op.write(&source_path, source_content.clone()).await?;

let target_path = format!("{parent}target");

Member

What is the behavior when renaming a file to a destination whose parent folder does not exist? It might be tested but I didn't check.

pub async fn test_rename_with_if_not_exists_returns_condition_not_match(
op: Operator,
) -> Result<()> {
let parent = format!("{}/", uuid::Uuid::new_v4());

Member

No need to have parent.

Labels

releases-note/feat The PR implements a new feature or has a title that begins with "feat" size:L This PR changes 100-499 lines, ignoring generated files.

3 participants