feat: contrib Delta serde + native exec — end-to-end native reads [Delta contrib split, part 4b]#8
Draft
schenksj wants to merge 2 commits into
schenksj added a commit that referenced this pull request on Jun 21, 2026
…ed (end-to-end native reads, fork #8)
…lta contrib split, part 4b]

Part 4b of the Delta Lake contrib PR breakup (tracking: apache#4366).

The red-to-green moment: a `-Pcontrib-delta` build now does END-TO-END native Delta reads. `CometExecRule`'s scanHandler lookup (wired in part 2) now resolves -- the serde converts the `CometDeltaScanMarker` (planted by part 4a's DeltaScanRule) into a `CometDeltaNativeScanExec` that reads through delta-kernel-rs (parts 3a/3b).

- `CometDeltaNativeScan.scala`: the serde, marker -> native scan operator (schema annotation, column mapping, row tracking, partition handling). CDF conversion is deferred to part 5 (the `convertCdf` path is carved out here to avoid a compile dependency on `CometDeltaCdfScanExec`). `ScanImpl` is not redefined; part 4a moved it to `DeltaScanMetadata`.
- `CometDeltaNativeScanExec.scala`: the exec (`CometScanWithPlanData`). It synthesises file partitions from kernel scan tasks and applies DPP pruning. Interim error semantics (until part 8): the `perPartitionFilePaths` / `FAILED_READ_FILE` plumbing is omitted, so a Delta read failure surfaces as a generic `CometNativeException` (the `CometExecRDD` param defaults to empty).
- `Native.scala`: JNI declarations binding the part-3a Rust entry points.
- `DeltaPlanDataInjector.scala`: registers under `OpStruct::DELTA_SCAN`; part 1's reflective registry picks it up, so per-partition Delta data is injected at execution.

No core / earlier-unit edits: the reflective wiring already reaches the serde + injector the moment these classes land.

Tests (gated, end-to-end native reads): CometDeltaNativeSuite (19), CometDeltaColumnMappingSuite (5), CometDeltaFeaturesSuite (8), CometDeltaCoverageSuite (24), CometDeltaColumnMappingPhysicalNameReproSuite (1); all pass. CometDeltaTestBase re-gains the native-read helpers (kept part 4a's marker helpers that are still used).

CometDeltaMarkerSuite updated: with the serde present, a claimed scan now engages `CometDeltaNativeScanExec` (it no longer leaves the marker in the plan), so its assertions moved from marker-presence to native engagement.

Verification: gated JVM test-compile, 60 contrib tests across 6 suites, spotless/scalastyle, check-suites, gate-verify (default build still 0 Delta symbols); all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
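The marker-to-exec handoff described in the commit message above can be pictured with a toy plan tree. This is only a minimal sketch under stated assumptions: `Plan`, `ScanMarker`, `NativeScan`, `Filter`, and `MarkerRewrite` are hypothetical stand-ins, not Spark or Comet classes, and the real serde does considerably more (schema annotation, column mapping, row tracking, partition handling). It shows why the marker stays in the plan when no handler resolves and disappears once one does, which is the shift the CometDeltaMarkerSuite assertions reflect.

```scala
// Toy plan nodes: hypothetical stand-ins, NOT the Spark or Comet classes.
sealed trait Plan
final case class ScanMarker(table: String) extends Plan
final case class NativeScan(table: String) extends Plan
final case class Filter(condition: String, child: Plan) extends Plan

object MarkerRewrite {
  // A handler turns a marker into some other plan node.
  // `None` models a build in which no serde is available.
  type Handler = ScanMarker => Plan

  // Walk the tree and replace each marker if a handler resolved.
  def rewrite(plan: Plan, handler: Option[Handler]): Plan = plan match {
    case m: ScanMarker    => handler.map(h => h(m)).getOrElse(m)
    case Filter(c, child) => Filter(c, rewrite(child, handler))
    case other            => other
  }

  // True if any marker is still present anywhere in the tree.
  def containsMarker(plan: Plan): Boolean = plan match {
    case _: ScanMarker    => true
    case Filter(_, child) => containsMarker(child)
    case _                => false
  }
}

object MarkerRewriteDemo {
  def main(args: Array[String]): Unit = {
    val plan: Plan = Filter("id > 10", ScanMarker("events"))

    // No handler: the marker is left in place.
    val withoutSerde = MarkerRewrite.rewrite(plan, None)

    // Handler present: the marker becomes a native scan node.
    val withSerde =
      MarkerRewrite.rewrite(plan, Some((m: ScanMarker) => NativeScan(m.table)))

    println(MarkerRewrite.containsMarker(withoutSerde))
    println(MarkerRewrite.containsMarker(withSerde))
  }
}
```

A test in the style of the updated suite would assert on the second case (no marker left, native node present) rather than on marker presence.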
…lision guard [apache#30 + themeA, folded into A.4b]
2dc47b1 to dd9a0d8
8d14cdb to 0ee301f
What this part is

The serde + native exec: the moment a `-Pcontrib-delta` build does end-to-end native Delta reads. `CometExecRule`'s scanHandler lookup (wired in part 2) now resolves: the serde converts the `CometDeltaScanMarker` (planted by part 4a's `DeltaScanRule`) into a `CometDeltaNativeScanExec` that reads through delta-kernel-rs (parts 3a/3b).

- `CometDeltaNativeScan`: serde (marker → native scan operator: schema annotation, column mapping, row tracking, partitioning). The CDF path (`convertCdf`) is deferred to part 5.
- `CometDeltaNativeScanExec`: the exec (`CometScanWithPlanData`); file partitions from kernel tasks + DPP pruning. Interim error semantics until part 8 (a read failure surfaces as a generic `CometNativeException`; the `FAILED_READ_FILE` path lands later).
- `Native.scala`: JNI declarations binding the part-3a Rust entry points.
- `DeltaPlanDataInjector`: registers under `OpStruct::DELTA_SCAN`; part 1's reflective registry picks it up for per-partition injection.

Why it's contained

No core / earlier-unit edits: the reflective wiring already reaches the serde + injector the moment these classes land. A.4b touches only `contrib/delta/src`. Default builds still carry zero Delta surface (gate-verify confirms 0 Delta symbols).

Verification

Gated JVM test-compile; 60 contrib tests across 6 suites (`CometDeltaNativeSuite` 19, ColumnMapping 5, Features 8, Coverage 24, PhysicalNameRepro 1, Marker 3), end-to-end native reads; spotless/scalastyle; check-suites; and the gate-verify script. All green.

🤖 AI disclosure: this PR was prepared with assistance from Claude Code (Claude Opus 4.8), under the submitter's review and direction.
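
For readers unfamiliar with the pattern, the "reflective wiring" claim (core needs no edits because it finds contrib classes by name at runtime) can be illustrated roughly as follows. This is a sketch under assumptions, not the registry from part 1: `ScanHandler`, `DemoDeltaScanHandler`, and `ReflectiveRegistry` are hypothetical names invented for the example. The only real APIs used are `Class.forName`, `getDeclaredConstructor().newInstance()`, and `scala.util.Try`.

```scala
import scala.util.Try

// Hypothetical interface that a core module might define.
trait ScanHandler {
  def name: String
}

// Hypothetical contrib-side implementation. In a default build this class
// would simply not be on the classpath.
class DemoDeltaScanHandler extends ScanHandler {
  override def name: String = "delta"
}

object ReflectiveRegistry {
  // Core refers to handlers only by class-name string, so it has no
  // compile-time dependency on them. A missing class yields None.
  def load(className: String): Option[ScanHandler] =
    Try(Class.forName(className).getDeclaredConstructor().newInstance())
      .toOption
      .collect { case h: ScanHandler => h }
}

object ReflectiveRegistryDemo {
  def main(args: Array[String]): Unit = {
    // Class present: lookup resolves to a handler instance.
    val present = ReflectiveRegistry.load(classOf[DemoDeltaScanHandler].getName)

    // Class absent (as in a build without the contrib module): lookup is empty.
    val absent = ReflectiveRegistry.load("no.such.HandlerOnClasspath")

    println(present.map(_.name))
    println(absent.isEmpty)
  }
}
```

The design point is that adding the class to the classpath is the only change needed for the lookup to start resolving, which is consistent with this part touching only `contrib/delta/src`.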