
feat: contrib Delta serde + native exec — end-to-end native reads [Delta contrib split, part 4b] #8

Draft
schenksj wants to merge 2 commits into pr/delta-A4a-scala-claim from pr/delta-A4b-scala-exec

Conversation

@schenksj

Owner

Fork-local review draft (Delta-contrib PR split, part 4b). The base is pr/delta-A4a-scala-claim, so the diff shows only A.4b. This is the red-to-green unit; it stacks on parts 1 (apache#4700), 2, 3a, 3b, and 4a. Tracking umbrella: apache#4366.

What this part is

This part adds the serde and the native exec: the point at which a -Pcontrib-delta build performs end-to-end native Delta reads. CometExecRule's scanHandler lookup (wired in part 2) now resolves: the serde converts the CometDeltaScanMarker (planted by part 4a's DeltaScanRule) into a CometDeltaNativeScanExec that reads through delta-kernel-rs (parts 3a/3b).

  • CometDeltaNativeScan — serde (marker → native scan operator: schema annotation, column mapping, row tracking, partitioning). The CDF path (convertCdf) is deferred to part 5.
  • CometDeltaNativeScanExec — the exec (CometScanWithPlanData): file partitions from kernel tasks + DPP pruning. Interim error semantics until part 8 (a read failure surfaces as a generic CometNativeException; the FAILED_READ_FILE path lands later).
  • Native.scala — JNI declarations binding the part-3a Rust entry points.
  • DeltaPlanDataInjector — registers under OpStruct::DELTA_SCAN; part 1's reflective registry picks it up for per-partition injection.
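
The marker-to-exec handoff described above can be pictured with a minimal sketch. Every name in it (ScanMarker, NativeScanExec, ScanHandler, DeltaHandler, ExecRuleSketch) is a hypothetical stand-in, not a real Comet class or signature. It only illustrates the shape of the behaviour: a handler lookup that finds nothing leaves the marker in the plan, and a lookup that resolves swaps in the native scan.

```scala
// Hypothetical, simplified stand-ins; these are NOT the real Comet classes.
sealed trait Plan
final case class ScanMarker(scanImpl: String, table: String) extends Plan
final case class NativeScanExec(table: String) extends Plan

// Assumed handler contract: turn a claimed marker into a native operator, if possible.
trait ScanHandler {
  def convert(marker: ScanMarker): Option[Plan]
}

// Stand-in for the serde that this part adds.
object DeltaHandler extends ScanHandler {
  def convert(marker: ScanMarker): Option[Plan] =
    Some(NativeScanExec(marker.table))
}

object ExecRuleSketch {
  // Look up a handler by the marker's scan implementation name.
  // If none is registered (or conversion declines), the marker is left untouched.
  def rewrite(plan: Plan, handlers: Map[String, ScanHandler]): Plan = plan match {
    case m: ScanMarker =>
      handlers.get(m.scanImpl).flatMap(_.convert(m)).getOrElse(m)
    case other => other
  }
}
```

With an empty handler map (the situation before the serde lands) `ExecRuleSketch.rewrite(ScanMarker("delta", "t"), Map.empty)` returns the marker unchanged; with `Map("delta" -> DeltaHandler)` it returns `NativeScanExec("t")`.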

Why it's contained

No core / earlier-unit edits — the reflective wiring already reaches the serde + injector the moment these classes land. A.4b touches only contrib/delta/src. Default builds still carry zero Delta surface (gate-verify confirms 0 Delta symbols).
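
As a rough illustration of why reflective wiring keeps default builds free of Delta symbols, here is a minimal sketch of an optional class lookup. The object name, the method name, and the class name used in the usage note are assumptions made for illustration; the actual registry from part 1 may work differently.

```scala
import scala.util.Try

// Hypothetical helper; not the registry implemented in part 1.
object ReflectiveLookupSketch {
  // Resolve a class by name at runtime. The caller has no compile-time
  // reference to it, so a build without the class still compiles and links.
  // Returns None when the class is not on the classpath.
  def findOptional(className: String): Option[Class[_]] =
    Try(Class.forName(className)).toOption
}
```

Under these assumptions, `ReflectiveLookupSketch.findOptional("org.example.hypothetical.DeltaSerde")` yields `None` on a classpath that lacks the class, and would start yielding `Some(...)` once a jar containing it is added, with no change to the calling code.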

Verification

All green: gated JVM test-compile; 60 contrib tests across 6 suites covering end-to-end native reads (CometDeltaNativeSuite 19, ColumnMapping 5, Features 8, Coverage 24, PhysicalNameRepro 1, Marker 3); spotless/scalastyle; check-suites; and the gate-verify script.


🤖 AI disclosure: this PR was prepared with assistance from Claude Code (Claude Opus 4.8), under the submitter's review and direction.

schenksj and others added 2 commits June 29, 2026 09:23
…lta contrib split, part 4b]

Part 4b of the Delta Lake contrib PR breakup (tracking: apache#4366). The red-to-green moment: a
`-Pcontrib-delta` build now does END-TO-END native Delta reads. `CometExecRule`'s scanHandler
lookup (wired in part 2) now resolves -- the serde converts the `CometDeltaScanMarker` (planted
by part 4a's DeltaScanRule) into a `CometDeltaNativeScanExec` that reads through delta-kernel-rs
(parts 3a/3b).

- `CometDeltaNativeScan.scala` — the serde: marker -> native scan operator (schema annotation,
  column mapping, row tracking, partition handling). CDF conversion is deferred to part 5 (the
  `convertCdf` path is carved out here to avoid a compile dependency on `CometDeltaCdfScanExec`).
  `ScanImpl` is not redefined — part 4a moved it to `DeltaScanMetadata`.
- `CometDeltaNativeScanExec.scala` — the exec (`CometScanWithPlanData`): synthesises file
  partitions from kernel scan tasks, applies DPP pruning. Interim error semantics (until part 8):
  the `perPartitionFilePaths` / `FAILED_READ_FILE` plumbing is omitted, so a Delta read failure
  surfaces as a generic `CometNativeException` (the `CometExecRDD` param defaults to empty).
- `Native.scala` — JNI declarations binding the part-3a Rust entry points.
- `DeltaPlanDataInjector.scala` — registers under `OpStruct::DELTA_SCAN`; part 1's reflective
  registry picks it up, so per-partition Delta data is injected at execution.

No core / earlier-unit edits — the reflective wiring already reaches the serde + injector the
moment these classes land.

Tests (gated, end-to-end native reads): CometDeltaNativeSuite (19), CometDeltaColumnMappingSuite (5),
CometDeltaFeaturesSuite (8), CometDeltaCoverageSuite (24), CometDeltaColumnMappingPhysicalNameReproSuite
(1) — all pass. CometDeltaTestBase re-gains the native-read helpers (kept part 4a's marker helpers
that are still used). CometDeltaMarkerSuite updated: with the serde present, a claimed scan now
engages `CometDeltaNativeScanExec` (it no longer leaves the marker in the plan), so its assertions
moved from marker-presence to native engagement.

Verification: gated JVM test-compile, 60 contrib tests across 6 suites, spotless/scalastyle,
check-suites, gate-verify (default build still 0 Delta symbols) — all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@schenksj schenksj force-pushed the pr/delta-A4a-scala-claim branch from 2dc47b1 to dd9a0d8 Compare June 29, 2026 13:29
@schenksj schenksj force-pushed the pr/delta-A4b-scala-exec branch from 8d14cdb to 0ee301f Compare June 29, 2026 13:29