Fix weight-only prequant layernorm export #1825
Signed-off-by: weimingc <17592131+meenchen@users.noreply.github.com>
Walkthrough
Adds a pre-quant-scale presence check to AWQ layernorm fusion in unified HF export and adds a unit test covering the path where that attribute is absent.
Changes: AWQ pre-quant-scale gating
Estimated code review effort: 2 (Simple), ~10 minutes
Pre-merge checks: 5 passed, 1 failed (1 warning)
Codecov Report
All modified and coverable lines are covered by tests.
@@            Coverage Diff             @@
##             main    #1825      +/-   ##
==========================================
- Coverage   77.36%   74.37%   -3.00%
==========================================
  Files         513      513
  Lines       56891    56894       +3
==========================================
- Hits        44013    42313    -1700
- Misses      12878    14581    +1703
What does this PR do?
Type of change: Bug fix
Fixes HF export for INT4 blockwise weight-only checkpoints. The export path previously used the coarse `int4_awq` format label to enter AWQ layernorm pre-quant fusion, even when the recipe had no input pre-quant scale state. The fusion gate now checks that all fused modules actually carry `_pre_quant_scale` before folding layernorm weights.
Usage
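As a rough illustration of the gate described above (not the actual ModelOpt code), the sketch below models the check in plain Python. The names `FakeQuantizer`, `FakeLinear`, `all_have_pre_quant_scale`, and `maybe_fuse_layernorm` are hypothetical, and it assumes that `_pre_quant_scale` lives on each module's `input_quantizer` and that folding means multiplying the layernorm weight elementwise by that scale.

```python
from typing import List, Optional, Sequence, Tuple


class FakeQuantizer:
    """Hypothetical stand-in for an input quantizer."""

    def __init__(self, pre_quant_scale: Optional[List[float]] = None):
        # The attribute is only set when the recipe produced a scale,
        # mirroring a weight-only recipe that never creates it.
        if pre_quant_scale is not None:
            self._pre_quant_scale = pre_quant_scale


class FakeLinear:
    """Hypothetical stand-in for a quantized linear layer."""

    def __init__(self, pre_quant_scale: Optional[List[float]] = None):
        self.input_quantizer = FakeQuantizer(pre_quant_scale)


def all_have_pre_quant_scale(modules: Sequence[FakeLinear]) -> bool:
    """Return True only if every module carries a non-None _pre_quant_scale."""
    return bool(modules) and all(
        getattr(getattr(m, "input_quantizer", None), "_pre_quant_scale", None)
        is not None
        for m in modules
    )


def maybe_fuse_layernorm(
    layernorm_weight: List[float],
    modules: Sequence[FakeLinear],
    quant_format: str,
) -> Tuple[List[float], bool]:
    """Fold the pre-quant scale into the layernorm weight when it is safe.

    The format label alone is not enough: fusion is skipped unless all
    fused modules actually carry a pre-quant scale.
    """
    if quant_format != "int4_awq" or not all_have_pre_quant_scale(modules):
        return list(layernorm_weight), False

    # Assumption: the fused modules share one scale, so the first is used.
    scale = modules[0].input_quantizer._pre_quant_scale
    fused = [w * s for w, s in zip(layernorm_weight, scale)]
    return fused, True
```

With two modules that both carry a scale, the weights are folded; with weight-only modules that lack the attribute, the layernorm weight is returned unchanged even though the format label is still `int4_awq`.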
Testing
- `/Users/weimingc/miniconda3/envs/modelopt/bin/python -m pytest tests/unit/torch/export/test_unified_export_hf.py -q`
- `/Users/weimingc/miniconda3/envs/modelopt/bin/python -m pytest tests/unit/torch/export -q`
- `hf_ptq.py` with `meta-llama/Llama-3.1-8B-Instruct`, `general/ptq/int4_blockwise_weight_only`, `--calib_size 64`, and `--skip_generate` completed export and artifact validation.
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: N/A
Additional Information
The regression test covers the export preprocessing path for the weight-only recipe and verifies layernorm pre-quant fusion is skipped when `_pre_quant_scale` is absent.
Summary by CodeRabbit