Optimize WAN pipelines: Reduce host overhead, hoist transposes, and fix Cache bugs #432

Draft
Perseus14 wants to merge 1 commit into main from wan-cleanup

@Perseus14 Perseus14 commented Jun 29, 2026

Collaborator

Overview

This PR adds performance optimizations and bug fixes to the Wan 2.1 and 2.2 pipelines (both T2V and I2V). It reduces host overhead in the denoising loop and prevents a crash under data-parallel configurations.

Key Optimizations

  • Hoisted I2V Tensor Operations: The condition tensors are now transposed and concatenated once, before the main denoising loop in the I2V pipelines (wan_pipeline_i2v_2p1.py, wan_pipeline_i2v_2p2.py), so jnp.transpose and jnp.concatenate no longer run redundantly on every inference step.
  • Hoisted Timesteps: Moved timestep calculations out of the denoising loop where possible.
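The hoisting described above can be illustrated with a minimal JAX sketch. It is not the code from this PR: the function names, tensor layouts, and the `toy_step` update are hypothetical stand-ins chosen only to show that preparing the condition tensor once gives the same result as rebuilding it on every step.

```python
import jax.numpy as jnp


def toy_step(model_input, latents):
  # Hypothetical stand-in for the transformer call plus scheduler update.
  return latents - 0.1 * jnp.mean(model_input, axis=-1, keepdims=True)


def denoise_per_step(latents, mask, image_latents, num_steps):
  # Baseline: the loop-invariant condition tensor is rebuilt on every step.
  for _ in range(num_steps):
    # mask and image_latents are assumed channels-first: (B, C, F, H, W).
    condition = jnp.concatenate([mask, image_latents], axis=1)
    # Assumed model layout is channels-last: (B, F, H, W, C).
    condition = jnp.transpose(condition, (0, 2, 3, 4, 1))
    model_input = jnp.concatenate([latents, condition], axis=-1)
    latents = toy_step(model_input, latents)
  return latents


def denoise_hoisted(latents, mask, image_latents, num_steps):
  # Hoisted: concatenate and transpose the condition once, outside the loop.
  condition = jnp.concatenate([mask, image_latents], axis=1)
  condition = jnp.transpose(condition, (0, 2, 3, 4, 1))
  for _ in range(num_steps):
    # Only the concat with the changing latents remains per step.
    model_input = jnp.concatenate([latents, condition], axis=-1)
    latents = toy_step(model_input, latents)
  return latents
```

Only work that does not depend on the current latents can be moved out of the loop; the concatenation with `latents` has to stay inside because the latents change at every step.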

Bug Fixes & Cleanups

  • CFG Cache + Ulysses Attention Crash Fix: Added dynamic checks in both Wan 2.1 and Wan 2.2 pipelines that disable use_cfg_cache when batch_size % data_shards != 0. Previously, cfg_cache evaluated only the unconditional branch (dropping the batch size to 1). A batch of 1 cannot be split evenly across more than one data shard, so ulysses_attention's shard_map raised a ValueError whenever the user configured data_shards > 1.
  • Logging Standardization: Replaced raw print() statements with max_logging.log() across the pipelines for cleaner distributed logging.
  • Formatting: Ran ruff and pyink to clean up linter warnings and unused imports.
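The CFG cache guard from the first bullet can be sketched as a small helper. This is an assumption about the shape of the check, not the code in this PR: the function name `resolve_use_cfg_cache` is hypothetical, and only the condition `batch_size % data_shards != 0` comes from the description above.

```python
def resolve_use_cfg_cache(use_cfg_cache: bool, batch_size: int, data_shards: int) -> bool:
  """Return whether the CFG cache can stay enabled (hypothetical helper)."""
  if data_shards < 1:
    raise ValueError("data_shards must be a positive integer")
  # Disable the cache when the batch cannot be split evenly across data shards.
  if use_cfg_cache and batch_size % data_shards != 0:
    return False
  return use_cfg_cache
```

For example, `resolve_use_cfg_cache(True, batch_size=1, data_shards=2)` returns `False`, which corresponds to the configuration that previously crashed.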

@Perseus14 Perseus14 self-assigned this Jun 29, 2026
@Perseus14 Perseus14 force-pushed the wan-cleanup branch 5 times, most recently from 304ef1a to da2175c June 29, 2026 12:22
…ncat/transpose churn reduction, replace prints with max_logging, and fix CFG cache multi-host sharding crashes
@Perseus14 Perseus14 changed the title from "Optimize WAN pipeline: hoist timesteps, batched text encoding, SenCac…" to "Optimize WAN pipelines: Reduce host overhead, hoist transposes, and fix Cache bugs" Jun 29, 2026