feat(core): add conditional rename option#7815
Conversation
|
@Xuanwo @erickguan Hi, could you please reivew this PR when have free time? Thanks! |
erickguan
left a comment
There was a problem hiding this comment.
No objection to add a toggle to a service. But a comment on your function design.
minor (non-blocking): perhaps we could have better names or options to confirm delete target objects in the long run.
| } | ||
|
|
||
| #[allow(deprecated)] | ||
| impl Default for HdfsConfig { |
There was a problem hiding this comment.
minor(non-blocking): we probably don't need Default in any of services.
There was a problem hiding this comment.
@erickguan Thanks very much for reviewing. nice advice, have fixed and pushed.
| } | ||
|
|
||
| #[derive(Debug, Clone, Copy, PartialEq, Eq)] | ||
| enum HdfsRenameTargetAction { |
There was a problem hiding this comment.
I would see either we have an enum to raise error on the callsite of hdfs_rename_existing_file_action. Otherwise, is returning Ok() enough?
| pub enable_append: bool, | ||
| /// atomic_write_dir of this backend | ||
| pub atomic_write_dir: Option<String>, | ||
| /// Whether HDFS rename should overwrite an existing target file. |
There was a problem hiding this comment.
I think this feature is focusing on the wrong thing. The semantics of OpenDAL are clear: rename should overwrite, just like the file system does. Therefore, HDFS should not alter the behavior itself. Instead, it should:
- Enable the atomic rename if HDFS supports it
- Fall back to delete-then-rename if HDFS does not support it
There was a problem hiding this comment.
Actually, HDFS supports atomic rename and suggests to use Options.Rename to perform rename operations which are defined as below:
https://github.com/apache/hadoop/blob/10bb684bb4d5cb8edbd2ce79a5d695592cef32c5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Options.java#L226
IIUC, should i add a new operation interface atomic_rename_if_not_exists here and use it in lance?
There was a problem hiding this comment.
I see. I believe the correct thing we should do here is to add a rename_if_not_exists for opendal.
3be5d77 to
c518749
Compare
|
@erickguan @Xuanwo Hi, have pushed latest changes. Could you please help review it when have free time? Thanks ! |
|
If we want to extend an API, we would benefit from a small RFC for future reference. Could you build one RFC too? |
Very nice advice. Have added a RFC. BTW. I think we can add a rule in AGENTS.md to constrain it. What's your opinions? |
|
@erickguan @Xuanwo All checks passed. please cc. Thanks! |
There was a problem hiding this comment.
RFC should live in a dedicated PR.
|
|
||
| async fn publish(op: Operator) -> Result<()> { | ||
| match op | ||
| .rename_if_not_exists("staging/file", "published/file") |
There was a problem hiding this comment.
We shouldn't provide such API, please check with how copy_if_not_exists works.
| } | ||
| ``` | ||
|
|
||
| The `Access::rename` signature does not change. Services inspect `OpRename` to |
| The API also exposes a semantic operation that some backends cannot support, so | ||
| portable callers must handle `Unsupported` or inspect the capability. | ||
|
|
||
| # Rationale and alternatives |
There was a problem hiding this comment.
Please check with other services like fs, s3 and check how rename works for them.
1d58950 to
77ed302
Compare
b3e66e3 to
0a9a84d
Compare
|
@Xuanwo @erickguan Hi, could you help review this PR when have free time? Thanks very much! |
erickguan
left a comment
There was a problem hiding this comment.
Generally good. Some minor comments.
I don't understand why do we need a tuple for options and destination. Can you explain?
| /// use opendal_core::options::RenameOptions; | ||
| /// use opendal_core::Result; | ||
| /// | ||
| /// fn test(op: blocking::Operator) -> Result<()> { |
There was a problem hiding this comment.
| /// fn test(op: blocking::Operator) -> Result<()> { | |
| /// fn rename_with_options(op: blocking::Operator) -> Result<()> { |
| /// use opendal_core::Operator; | ||
| /// use opendal_core::Result; | ||
| /// | ||
| /// async fn test(op: Operator) -> Result<()> { |
There was a problem hiding this comment.
| /// async fn test(op: Operator) -> Result<()> { | |
| /// async fn rename_with_options(op: Operator) -> Result<()> { |
| ctx: OperationContext, | ||
| srv: Servicer, | ||
| from: String, | ||
| (opts, to): (options::RenameOptions, String), |
There was a problem hiding this comment.
Eh, what is so special with options and destination? Do we need a tuple?
| ErrorKind::ConditionNotMatch, | ||
| "target path already exists while if_not_exists is set", | ||
| ) | ||
| .with_context("input", to_path) |
|
|
||
| Refer to [`HdfsBuilder`]'s public API docs for more information. | ||
|
|
||
| ### Rename Behavior |
There was a problem hiding this comment.
We don't need documentation on this. Users will refer to operator's documentation.
|
|
||
| /// Rename to a nonexistent path should succeed when if_not_exists is set. | ||
| pub async fn test_rename_with_if_not_exists(op: Operator) -> Result<()> { | ||
| let parent = format!("{}/", uuid::Uuid::new_v4()); |
There was a problem hiding this comment.
No need for parent since we have uuid here.
|
|
||
| op.write(&source_path, source_content.clone()).await?; | ||
|
|
||
| let target_path = format!("{parent}target"); |
There was a problem hiding this comment.
What is the behavior to rename files into a destination without destination's parent folder? It might be tested but I didn't check.
| pub async fn test_rename_with_if_not_exists_returns_condition_not_match( | ||
| op: Operator, | ||
| ) -> Result<()> { | ||
| let parent = format!("{}/", uuid::Uuid::new_v4()); |
Which issue does this PR close?
N/A.
Rationale for this change
OpenDAL's default
renamecontract overwrites an existing destination. Somecallers need an atomic publish operation that moves a source only when the
destination does not exist. A separate
statfollowed byrenamecannotprovide this guarantee because another writer may create the destination
between the two operations.
This implementation follows the options-based API accepted in
RFC #7818.
What changes are included in this PR?
RenameOptions,Operator::rename_with(...).if_not_exists(true), andblocking
rename_options.OpRename::if_not_existsandCapability::rename_with_if_not_exists.Unsupportedwhen a service does notadvertise the capability.
operation.
ConditionNotMatch, including conflictsreported by the final native rename, while leaving both paths unchanged.
rename.user-facing documentation.
Are there any user-facing changes?
Yes. Rust users can request an atomic no-replace rename with:
Normal
renamebehavior remains unchanged. HDFS initially advertises the newcapability; other services return
Unsupportedunless they implement andadvertise the same atomic guarantee.
AI Usage Statement
OpenAI Codex (GPT-5) was used to prepare and review this PR.