test(workflow-operator): fill uncovered branches in source/scan and projection descriptors #6100
…rojection descriptors
Extends the existing specs to cover Codecov-missed lines: ProjectionOpDesc.derivePartition (hash/range/pass-through arms), and the getPhysicalOp wiring + schema propagation for TextInputSourceOpDesc, FileScanSourceOpDesc (plus its outputFileName schema branch), and FileScanOpDesc.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #6100 +/- ##
============================================
+ Coverage 59.06% 59.22% +0.15%
- Complexity 3203 3209 +6
============================================
Files 1132 1132
Lines 43674 43674
Branches 4734 4734
============================================
+ Hits 25797 25864 +67
+ Misses 16448 16378 -70
- Partials 1429 1432 +3
This pull request uses carry forward flags.
Pull request overview
This PR extends unit test coverage in common/workflow-operator for several operator descriptors’ pure descriptor logic (physical-op wiring, schema propagation, and projection partition derivation), based on previously uncovered branches.
Changes:
- Add `getPhysicalOp` wiring + schema propagation assertions for `TextInputSourceOpDesc`, `FileScanSourceOpDesc`, and `FileScanOpDesc`.
- Add `derivePartition` branch coverage tests for `ProjectionOpDesc` (hash/range/pass-through behaviors).
- Add a `FileScanSourceOpDesc.sourceSchema` test covering the `outputFileName`-enabled branch.
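To make the three `derivePartition` arms concrete, here is a minimal, self-contained sketch of how a projection might map an input partitioning to an output partitioning. Every type and name below (`PartitionInfo`, `HashPartition`, `RangePartition`, `ProjectionPartitionSketch`, the `kept` parameter) is a hypothetical stand-in for illustration, not the actual Texera API. The range arm mirrors the hash arm purely as an assumption, since the PR does not state the real outcome of the range or empty-range cases.

```scala
// Hypothetical, simplified stand-ins; these are NOT the real Texera types.
sealed trait PartitionInfo
final case class HashPartition(attrs: List[String]) extends PartitionInfo
final case class RangePartition(attrs: List[String]) extends PartitionInfo
case object SinglePartition extends PartitionInfo
case object UnknownPartition extends PartitionInfo

object ProjectionPartitionSketch {

  // `kept` is the set of attribute names the (hypothetical) projection retains.
  def derivePartition(kept: Set[String])(input: PartitionInfo): PartitionInfo =
    input match {
      // Hash arm: keep the partitioning only if some key attribute survives.
      case HashPartition(attrs) =>
        val surviving = attrs.filter(kept.contains)
        if (surviving.nonEmpty) HashPartition(surviving) else UnknownPartition

      // Range arm: ASSUMED to behave like the hash arm in this sketch.
      case RangePartition(attrs) =>
        val surviving = attrs.filter(kept.contains)
        if (surviving.nonEmpty) RangePartition(surviving) else UnknownPartition

      // Pass-through arm: any other partitioning is returned unchanged.
      case other => other
    }
}
```

A spec that covers every branch of a function shaped like this needs at least one input per arm, plus the empty-result case inside each attribute-based arm.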
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/scan/text/TextInputSourceOpDescSpec.scala | Adds tests for getPhysicalOp wiring and output-schema propagation for the text input source descriptor. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/scan/file/FileScanSourceOpDescSpec.scala | Adds tests for getPhysicalOp wiring + schema propagation, and sourceSchema behavior when outputFileName is enabled. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/scan/file/FileScanOpDescSpec.scala | Adds tests for getPhysicalOp wiring and output-schema propagation for the file-scan (non-source) descriptor. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/projection/ProjectionOpDescSpec.scala | Adds tests to cover derivePartition behavior for hash/range/other partition types. |
| | config | throughput (tuples/sec) | MB/s | latency p50/p95/p99 | max Δ latest / 7d |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 407 | 0.248 | 23,032/34,574/34,574 us | 🔴 +7.8% / 🔴 +125.5% |
| 🟢 | bs=100 sw=10 sl=64 | 824 | 0.503 | 121,858/139,711/139,711 us | 🟢 -11.4% / 🔴 +29.0% |
| ⚪ | bs=1000 sw=10 sl=64 | 959 | 0.585 | 1,040,787/1,067,659/1,067,659 us | ⚪ within ±5% / 🔴 -5.4% |
Baseline details
Latest main 4c9d30a from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 407 tuples/sec | 431 tuples/sec | 772.08 tuples/sec | -5.6% | -47.3% |
| bs=10 sw=10 sl=64 | MB/s | 0.248 MB/s | 0.263 MB/s | 0.471 MB/s | -5.7% | -47.4% |
| bs=10 sw=10 sl=64 | p50 | 23,032 us | 24,560 us | 12,745 us | -6.2% | +80.7% |
| bs=10 sw=10 sl=64 | p95 | 34,574 us | 32,062 us | 15,330 us | +7.8% | +125.5% |
| bs=10 sw=10 sl=64 | p99 | 34,574 us | 32,062 us | 19,054 us | +7.8% | +81.5% |
| bs=100 sw=10 sl=64 | throughput | 824 tuples/sec | 815 tuples/sec | 982.64 tuples/sec | +1.1% | -16.1% |
| bs=100 sw=10 sl=64 | MB/s | 0.503 MB/s | 0.497 MB/s | 0.6 MB/s | +1.2% | -16.1% |
| bs=100 sw=10 sl=64 | p50 | 121,858 us | 117,323 us | 101,961 us | +3.9% | +19.5% |
| bs=100 sw=10 sl=64 | p95 | 139,711 us | 157,769 us | 108,335 us | -11.4% | +29.0% |
| bs=100 sw=10 sl=64 | p99 | 139,711 us | 157,769 us | 114,379 us | -11.4% | +22.1% |
| bs=1000 sw=10 sl=64 | throughput | 959 tuples/sec | 956 tuples/sec | 1,013 tuples/sec | +0.3% | -5.3% |
| bs=1000 sw=10 sl=64 | MB/s | 0.585 MB/s | 0.584 MB/s | 0.618 MB/s | +0.2% | -5.4% |
| bs=1000 sw=10 sl=64 | p50 | 1,040,787 us | 1,048,191 us | 993,573 us | -0.7% | +4.8% |
| bs=1000 sw=10 sl=64 | p95 | 1,067,659 us | 1,097,334 us | 1,032,489 us | -2.7% | +3.4% |
| bs=1000 sw=10 sl=64 | p99 | 1,067,659 us | 1,097,334 us | 1,065,526 us | -2.7% | +0.2% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,491.30,200,128000,407,0.248,23032.44,34574.00,34574.00
1,100,10,64,20,2426.72,2000,1280000,824,0.503,121857.84,139711.18,139711.18
2,1000,10,64,20,20849.65,20000,12800000,959,0.585,1040787.28,1067658.84,1067658.84
…erivePartition specs
- assert the wired exec class via classOf[...].getName instead of a hardcoded FQN string (resilient to refactors) in the TextInput/FileScanSource/FileScan descriptor specs
- add an empty-RangePartition case to ProjectionOpDesc.derivePartition
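The `classOf[...].getName` change in the commit above can be illustrated with a small sketch. The names `TextInputExecSketch`, `PhysicalOpSketch`, `WiringSketch`, and `isWiredTo` are hypothetical and exist only for this example. The real descriptors and their physical-op type are not reproduced here.

```scala
// Hypothetical executor class standing in for a real operator executor.
final class TextInputExecSketch

// Hypothetical physical-op record that stores the executor's class name.
final case class PhysicalOpSketch(execClassName: String)

object WiringSketch {
  // Stand-in for a descriptor's getPhysicalOp: wires the executor by class name.
  def getPhysicalOp: PhysicalOpSketch =
    PhysicalOpSketch(classOf[TextInputExecSketch].getName)

  // Compares against the class itself, so renaming or moving the class
  // updates the expected value at compile time instead of leaving a stale string.
  def isWiredTo(op: PhysicalOpSketch, cls: Class[_]): Boolean =
    op.execClassName == cls.getName
}
```

In a spec, the check would then read `WiringSketch.isWiredTo(op, classOf[TextInputExecSketch])` rather than comparing against a quoted fully qualified name. A quoted name has to be edited by hand whenever the class is renamed or moved.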
What changes were proposed in this PR?
Fill uncovered branches in four source/scan and projection descriptors, selected from the Codecov report. No production-code changes — extends the existing specs.
- `ProjectionOpDesc.scala`: `derivePartition`, the hash (kept/empty→unknown), range, and pass-through arms
- `TextInputSourceOpDesc.scala`: `getPhysicalOp` wiring (exec class, ports) + schema propagation
- `FileScanSourceOpDesc.scala`: `getPhysicalOp` wiring + propagation, and the `outputFileName` filename-column branch of `sourceSchema`
- `FileScanOpDesc.scala`: `getPhysicalOp` wiring (input/output ports) + schema propagation
All targets exercise pure in-memory descriptor logic (`getPhysicalOp` / `derivePartition` / `sourceSchema`); the existing specs only drove the executors or `sourceSchema` defaults.
Any related issues, documentation, discussions?
Follow-up to the review feedback on #6043: prioritize tests that fill uncovered code paths.
How was this PR tested?
- `sbt "WorkflowOperator/testOnly *ProjectionOpDescSpec *TextInputSourceOpDescSpec *FileScanSourceOpDescSpec *FileScanOpDescSpec"`: 37 tests, all green
- `sbt "WorkflowOperator/Test/scalafmtCheck"` and `sbt "WorkflowOperator/scalafixAll --check"`: clean
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.8 [1M context])