
feat: contrib Delta executor-side Rust (kernel read + deletion vectors) [Delta contrib split, part 3b]#6

Draft
schenksj wants to merge 2 commits into pr/delta-A3a-rust-driver from pr/delta-A3b-rust-executor

@schenksj

Fork-local review draft (Delta-contrib PR split, part 3b). Base is pr/delta-A3a-rust-driver so the diff shows only A.3b. Completes the Rust side; stacks on parts 1 (apache#4700), 2, 3a. Tracking umbrella: apache#4366.

What this part is

The Rust executor side that completes the contrib Delta native crate. It replaces the build-gate stub planner with the real one, so a -Pcontrib-delta build now does end-to-end native Delta reads: given a scan task, read through delta-kernel-rs, apply the transform + deletion vectors.

  • planner.rs — replaces the stub: assembles the per-task DataSourceExec (kernel scan + partition values + DV filter), reached by the unchanged core dispatch shim.
  • kernel_scan.rs — the kernel read bridge (schema resolution, column-mapping, row-tracking, transform). planner <-> kernel_scan are mutually dependent and ship together.
  • dv_reader.rs — Delta deletion-vector decode (inline + on-disk), surfaced as a DataFusion filter; a missing DV file maps to SparkError::FileNotFound for JVM error parity.
  • lib.rs / Cargo.toml — re-add the executor module decls, proto re-exports, and deps deferred in 3a.
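
For reviewers less familiar with deletion vectors, here is a minimal sketch of the two behaviours described for dv_reader.rs: turning a set of deleted row indexes into a per-batch keep mask, and mapping a missing DV file to a FileNotFound error. It is illustrative only and is not the code in this PR. `SketchError`, `map_dv_io_error`, and `keep_mask` are hypothetical names, a `BTreeSet<u64>` stands in for the decoded roaring bitmap, and a `Vec<bool>` stands in for whatever DataFusion filter the crate actually builds.

```rust
use std::collections::BTreeSet;
use std::io;

/// Hypothetical stand-in for the crate's error type. Only the
/// `FileNotFound` variant name is taken from the PR description.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    FileNotFound(String),
    Other(String),
}

/// Map an I/O failure hit while opening an on-disk DV file.
/// A missing file becomes `FileNotFound`; anything else is passed
/// through as a generic error carrying the path and the cause.
pub fn map_dv_io_error(path: &str, err: &io::Error) -> SketchError {
    match err.kind() {
        io::ErrorKind::NotFound => SketchError::FileNotFound(path.to_string()),
        _ => SketchError::Other(format!("{}: {}", path, err)),
    }
}

/// Build a keep mask for the rows `[first_row, first_row + len)` of a
/// data file. `deleted` holds file-level row indexes marked as deleted
/// (assumption: already decoded from the deletion vector).
/// `true` means the row survives the filter.
pub fn keep_mask(deleted: &BTreeSet<u64>, first_row: u64, len: usize) -> Vec<bool> {
    (0..len as u64)
        .map(|offset| !deleted.contains(&(first_row + offset)))
        .collect()
}
```

With rows 1, 4 and 5 deleted, `keep_mask(&deleted, 0, 6)` keeps rows 0, 2 and 3. The `first_row` offset matters because row indexes in a deletion vector are relative to the data file, not to the batch being filtered.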

Why it's contained

Core is untouched — the dispatch shim is byte-identical; it now reaches the real planner instead of the stub. A.3b touches only contrib/delta/native + the native lockfile. Default (non-contrib-delta) builds still carry zero Delta surface (the gate-verify script confirms 0 Delta symbols in the default libcomet; the contrib build is ~13 MB larger).
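
As a rough illustration of the "same shim, different planner behind it" shape, the sketch below uses a Cargo feature gate. This is an assumption about one way to get that shape in Rust, not a description of how this repository wires it: the feature name `contrib-delta`, `PlanSketch`, and `dispatch` are hypothetical, and only `plan_delta_scan` is a name taken from the commit message.

```rust
/// Hypothetical plan handle. The real planner assembles a DataSourceExec.
#[derive(Debug, PartialEq)]
pub struct PlanSketch(pub String);

/// Stub planner: compiled when the (assumed) feature is off.
/// It carries no Delta logic and only reports that support is absent.
#[cfg(not(feature = "contrib-delta"))]
mod planner {
    use super::PlanSketch;

    pub fn plan_delta_scan(_task: &str) -> Result<PlanSketch, String> {
        Err("Delta support not compiled in".to_string())
    }
}

/// Real planner: compiled only when the (assumed) feature is on.
#[cfg(feature = "contrib-delta")]
mod planner {
    use super::PlanSketch;

    pub fn plan_delta_scan(task: &str) -> Result<PlanSketch, String> {
        Ok(PlanSketch(format!("delta scan for {}", task)))
    }
}

/// The shim is identical in both builds; only the module it reaches changes.
pub fn dispatch(task: &str) -> Result<PlanSketch, String> {
    planner::plan_delta_scan(task)
}
```

Because `dispatch` never names a Delta type directly, swapping the stub for the real planner needs no change on the calling side, which is the property the paragraph above relies on.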

Verification

Gated native build, 89 in-crate unit tests (54 driver + 35 executor), default native build unchanged, clippy (both feature states), the gate-verify script, and cargo fmt: all green. The review pass also removed four dead dependencies that the integration branch carried (roaring, datafusion-datasource, a direct parquet dep, and datafusion's parquet feature). None of them is referenced by the code; all parquet I/O flows through delta_kernel.


🤖 AI disclosure: this PR was prepared with assistance from Claude Code (Claude Opus 4.8), under the submitter's review and direction.

schenksj and others added 2 commits June 29, 2026 09:23
…s) [Delta contrib split, part 3b]

Part 3b of the Delta Lake contrib PR breakup (tracking: apache#4366). Completes the contrib
native crate: the executor-side read path replaces the build-gate stub planner, so a
`-Pcontrib-delta` build now does end-to-end native Delta reads (given a scan task, read
through delta-kernel-rs, apply the transform + deletion vectors).

- `planner.rs` - replaces the stub: assembles the per-task `DataSourceExec` (parquet scan +
  partition values + DV filter), wired to the core dispatch shim's `plan_delta_scan` call.
- `kernel_scan.rs` - the kernel read bridge (`planner` <-> `kernel_scan` are mutually
  dependent and ship together): schema resolution, column-mapping, row-tracking, transform.
- `dv_reader.rs` - Delta deletion-vector decode (inline + on-disk roaring bitmaps), surfaced
  as a DataFusion filter; missing-DV-file maps to SparkError::FileNotFound for parity.
- `lib.rs` - re-adds the `dv_reader`/`kernel_scan` module decls and the
  `DeltaScan`/`DeltaScanCommon` proto re-exports trimmed in 3a.
- `Cargo.toml` - re-adds the executor deps (parquet, roaring, datafusion-datasource, futures,
  chrono*, comet-common, tokio dev-dep) deferred from 3a.

Core is untouched -- the dispatch shim is unchanged; it now reaches the real planner instead
of the stub. The native crate is now equivalent to the integration branch (modulo the crate
version, kept at 0.18.0, and a clarified planner doc-link). Default builds still carry zero
Delta surface.

Verification: gated native build, 89 in-crate unit tests (54 driver + 35 executor), default
native build unchanged, clippy (both feature states), gate-verify script (contrib libcomet
+13 MB), cargo fmt -- all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@schenksj schenksj force-pushed the pr/delta-A3b-rust-executor branch from 7c1773f to 0fab506 Compare June 29, 2026 13:29