impr: Radiance Cascades v2 #2617
… the new system, add cool factor
# Conflicts:
# apps/typegpu-docs/package.json
# apps/typegpu-docs/src/examples/rendering/radiance-cascades-drawing/drawInteraction.ts
# apps/typegpu-docs/src/examples/rendering/radiance-cascades-drawing/index.ts
# apps/typegpu-docs/src/examples/rendering/radiance-cascades/drag-controller.ts
# apps/typegpu-docs/src/examples/rendering/radiance-cascades/index.ts
# apps/typegpu-docs/tests/individual-example-tests/jump-flood-distance.test.ts
# packages/typegpu-radiance-cascades/README.md
# packages/typegpu-radiance-cascades/package.json
# packages/typegpu-radiance-cascades/src/cascades.ts
# packages/typegpu-radiance-cascades/src/index.ts
# packages/typegpu-radiance-cascades/src/runner.ts
# packages/typegpu-sdf/src/jumpFlood.ts
# packages/typegpu/src/tgsl/accessProp.ts
# pnpm-lock.yaml
📊 Bundle Size Comparison
👀 Notable results
Static test results: No major changes.
Dynamic test results:
📋 All results: the results table has 356 entries.
If you wish to run a comparison for other, slower bundlers, run the 'Tree-shake test' from the GitHub Actions menu.
Resolution Time Benchmark
All times are in milliseconds, measured for the PR branch, main, and the latest release.

Random Branching
| Max depth | PR | main | release |
|---|---|---|---|
| 1 | 0.91 | 0.92 | 0.95 |
| 2 | 1.92 | 1.90 | 1.91 |
| 3 | 4.31 | 4.09 | 4.20 |
| 4 | 6.53 | 6.81 | 6.23 |
| 5 | 8.03 | 7.13 | 7.46 |
| 6 | 10.58 | 11.89 | 11.69 |
| 7 | 21.46 | 20.95 | 22.26 |
| 8 | 27.08 | 23.34 | 25.27 |

Linear Recursion
| Max depth | PR | main | release |
|---|---|---|---|
| 1 | 0.30 | 0.29 | 0.30 |
| 2 | 0.52 | 0.55 | 0.75 |
| 3 | 0.67 | 0.65 | 0.73 |
| 4 | 0.85 | 0.81 | 0.91 |
| 5 | 1.13 | 1.19 | 1.18 |
| 6 | 1.21 | 1.23 | 1.27 |
| 7 | 1.46 | 1.43 | 1.55 |
| 8 | 1.55 | 1.57 | 1.63 |

Full Tree
| Max depth | PR | main | release |
|---|---|---|---|
| 1 | 0.84 | 0.83 | 0.90 |
| 2 | 2.06 | 1.96 | 2.06 |
| 3 | 3.70 | 3.93 | 3.67 |
| 4 | 6.34 | 6.10 | 6.17 |
| 5 | 12.26 | 12.11 | 12.56 |
| 6 | 25.85 | 25.61 | 25.76 |
| 7 | 55.90 | 57.11 | 56.70 |
| 8 | 113.41 | 112.70 | 113.83 |
@pullfrog pls review
✅ No new issues found.
Reviewed changes — v2 rewrite of the radiance cascades package with better cascade sizing, Morton ray ordering, configurable merge modes, encoder support, and memory optimization.
- Add `getCascadeInfo` with per-layer metadata: replaces the flat `getCascadeDim` with a rich `CascadeInfo` structure carrying per-layer probe counts, UV ranges, and valid dimensions. The old `getCascadeDim` is kept as a convenience wrapper.
- Morton Z-order ray ordering: replaces linear ray indexing with Morton encoding for more uniform angular sampling. `part1By1` and `morton2D` are defined in `cascades.ts` and used in `makeCascadePassCompute`.
- Aspect-correct ray directions: `rayDirection` and `segmentMetricLength` use `renderAspectSlot` to produce correct ray distributions on non-square render targets. `rayBoxExitUv` clips rays to the unit box.
- Configurable cascade layers: support for `baseStoredRayDim` (1/2/4), `mergeMode` (`hardware` or `bilinear-fix`), per-layer uniform buffers (`CascadeLayerParams`), and exposure of common slots (`maxRayStepsSlot`, `rayMarchStepSafetySlot`, `traceSegmentSlot`).
- Encoder support: `run()` accepts an optional `GPUCommandEncoder` so cascade and jump-flood passes can be batched with surrounding work. When omitted, `run()` creates and submits its own encoder.
- 2D ping-pong cascade textures by default: reduces memory usage significantly. 3D array textures are used only when `keepCascadeLayers: true`.
- Prebuilt JFA pipelines: `createJumpFlood` now prebuilds one pipeline per JFA step instead of writing `offsetUniform` at runtime. Fixes `destroy()` to clean up offset buffers and uniforms.
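For readers unfamiliar with Morton (Z-order) encoding, here is a minimal CPU-side sketch of the standard bit-interleaving trick. The function names mirror `part1By1` and `morton2D` from the summary above, but the bodies are an assumption based on the common textbook formulation, not a copy of the code in `cascades.ts`, which runs on the GPU and may differ in detail.

```typescript
// Spread the low 16 bits of x so that a zero bit sits between
// every pair of original bits (abcd -> 0a0b0c0d).
function part1By1(x: number): number {
  x &= 0x0000ffff;
  x = (x | (x << 8)) & 0x00ff00ff;
  x = (x | (x << 4)) & 0x0f0f0f0f;
  x = (x | (x << 2)) & 0x33333333;
  x = (x | (x << 1)) & 0x55555555;
  return x >>> 0; // keep the result as an unsigned 32-bit value
}

// Interleave x (even bit positions) and y (odd bit positions)
// into a single Z-order index.
function morton2D(x: number, y: number): number {
  return (part1By1(x) | (part1By1(y) << 1)) >>> 0;
}

console.log(morton2D(2, 1)); // 6
```

Neighbouring cells in 2D tend to receive nearby Morton indices, which is the property the summary refers to when it mentions more uniform sampling than a plain row-major index.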
DeepSeek Pro (free via Pullfrog for OSS) | 𝕏
@pullfrog could you estimate how much less gpu memory we allocate for a couple given resolutions?
The biggest memory win comes from the cascade ping-pong textures switching from 3D arrays to 2D (the default).
Cascade texture memory: old vs new (default baseStoredRayDim=2)
| Resolution | Cascade Dim | Old Layers | Old Memory (2×3D) | New Memory (2×2D) | Savings |
|---|---|---|---|---|---|
| 4K (2160p) | 8192×4608 | 8 | 4.50 GB | 576 MB | 3.94 GB |
| 1440p | 4096×2304 | 7 | 1008 MB | 144 MB | 864 MB |
| 1080p | 4096×2304 | 7 | 1008 MB | 144 MB | 864 MB |
| 720p | 2048×1152 | 7→6 | 252 MB | 36 MB | 216 MB |
| 1024×768 | 2048×1536 | 7→6 | 336 MB | 48 MB | 288 MB |
| 800×600 | 1024×768 | 6 | 72 MB | 12 MB | 60 MB |
(1080p and 1440p share the same cascade dim because the closest-power-of-two diagonal rounds to the same value.)
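The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes 8 bytes per texel (for example an `rgba16float` texture) and binary units (1 MB = 1024 × 1024 bytes); both are inferred from the numbers in the table rather than read from the PR, and `cascadeMemoryMiB` is a hypothetical helper, not part of the package.

```typescript
// Assumption: 8 bytes per texel, inferred from the table above.
const BYTES_PER_TEXEL = 8;
const MIB = 1024 * 1024;

// Hypothetical helper: memory for the two ping-pong cascade textures.
function cascadeMemoryMiB(width: number, height: number, layers: number) {
  const perTexture2D = (width * height * BYTES_PER_TEXEL) / MIB;
  // New default: two 2D textures, independent of the layer count.
  const newMiB = 2 * perTexture2D;
  // Old layout: two 3D textures with one slice per cascade layer.
  const oldMiB = 2 * perTexture2D * layers;
  return { oldMiB, newMiB, savedMiB: oldMiB - newMiB };
}

console.log(cascadeMemoryMiB(4096, 2304, 7));
// { oldMiB: 1008, newMiB: 144, savedMiB: 864 }
```

Under these assumptions the saving grows linearly with the number of cascade layers, since the old layout paid for every layer twice while the new one pays for a single 2D slice twice.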
Tuning baseStoredRayDim (new path, 2×2D only)
Lower values reduce cascade dimensions further at the cost of ray angular resolution:
| Resolution | baseStoredRayDim=1 | baseStoredRayDim=2 (default) | baseStoredRayDim=4 |
|---|---|---|---|
| 4K | 144 MB | 576 MB | 2.25 GB |
| 1080p / 1440p | 36 MB | 144 MB | 576 MB |
| 720p | 9 MB | 36 MB | 144 MB |
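The table suggests that cascade memory scales with the square of `baseStoredRayDim`, so each step (1, 2, 4) multiplies the footprint by four. A minimal sketch of that relation follows; the quadratic rule is inferred from the table values, and `scaleForRayDim` is a hypothetical helper.

```typescript
type BaseStoredRayDim = 1 | 2 | 4;

// Hypothetical helper: rescale the memory measured at the default
// setting (baseStoredRayDim = 2) to another setting.
// Assumption: memory is proportional to baseStoredRayDim squared.
function scaleForRayDim(defaultMiB: number, baseStoredRayDim: BaseStoredRayDim): number {
  const factor = (baseStoredRayDim / 2) ** 2;
  return defaultMiB * factor;
}

console.log(scaleForRayDim(144, 1)); // 36
console.log(scaleForRayDim(144, 4)); // 576
```

For the 4K row, `scaleForRayDim(576, 4)` gives 2304 MB, which matches the 2.25 GB entry.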
Uniform buffers
A negligible per-layer uniform buffer (CascadeLayerParams, ~48 B each) replaces the old single CascadeStaticParams + layerBuffer — about the same total, just split per-layer so no layerBuffer.write() is needed during dispatch.
Task list (4/4 completed)
- Explore codebase to understand GPU memory allocation patterns
- Find resolution-dependent buffer/texture allocations
- Calculate and estimate memory savings for given resolutions
- Report findings via pullfrog_report_progress