Support INT block scale learning #1795
Signed-off-by: realAsma <akuriparambi@nvidia.com>
Keep tied Dual LSQ as a second-pass reducer if independent Dual LSQ is unstable or too expensive.
Train CE loss.
Export compatibility status for static INT block variants.
Addressed the QAD-plan comments in the PR body:
Plan comments addressed: #1795 (comment), #1795 (comment), #1795 (comment)
ec72e81 to 4f44252
What does this PR do?
Type of change: new feature, new tests
Adds minimal INT block scale-learning support for the LAQ investigation branch:
- `StaticBlockScaleQuantizer` after `max`, `mse`, or `local_hessian` scale initialization so LAQ can start from non-max initializers.
- `num_bits` and `block_sizes: {"type": "dynamic"}`.
Usage
Dynamic max scale baseline:
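As a rough illustration of what a dynamic max block scale computes, here is a minimal pure-Python sketch. It assumes symmetric signed INT levels and a flat list of values split into fixed-size blocks. The function name `fake_quant_int_dynamic_max` and its parameters are hypothetical and are not part of the modelopt API; the actual quantizer in this PR may differ in rounding, clamping range, and tensor layout.

```python
def fake_quant_int_dynamic_max(values, num_bits=4, block_size=4):
    """Fake-quantize floats block by block using a per-block max scale.

    Hypothetical helper for illustration only (not a modelopt function).
    Assumes symmetric signed INT levels in [-qmax, qmax].
    """
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 7 for 4-bit signed
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # "Dynamic max": the scale is recomputed from the block's own amax.
        amax = max(abs(v) for v in block)
        scale = amax / qmax if amax > 0 else 1.0  # avoid division by zero
        for v in block:
            q = round(v / scale)           # Python rounds ties to even
            q = max(-qmax, min(qmax, q))   # clamp to the INT range
            out.append(q * scale)          # dequantize back to float
    return out


if __name__ == "__main__":
    x = [0.7, -0.35, 0.1, 0.0, 2.0, 1.0, -2.0, 0.5]
    print(fake_quant_int_dynamic_max(x, num_bits=4, block_size=4))
```

Because the scale is derived from each block's maximum absolute value, the largest element of every block is reproduced up to floating-point error. A learned block scale would replace `amax / qmax` with a trainable parameter that is only initialized from a calibration statistic.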
QAD experiment plan
The QAD experiment plan is intentionally kept out of the PR description and
shared separately with the owner for review before any QAD jobs are launched.
Testing
- `python_pwd -m ruff check modelopt/torch/quantization/tensor_quant.py modelopt/torch/quantization/nn/modules/tensor_quantizer.py tests/unit/torch/quantization/test_laq.py`
- `pytest_pwd tests/unit/torch/quantization/test_laq.py -q`
- `pre-commit run --files modelopt/torch/quantization/tensor_quant.py modelopt/torch/quantization/nn/modules/tensor_quantizer.py tests/unit/torch/quantization/test_laq.py`
- `pytest_pwd tests/unit/torch/quantization/test_laq.py -q` after pre-commit
Before your PR is "Ready for review"
- CONTRIBUTING.md: N/A
Additional Information
Draft PR targeting the LAQ branch for owner review before launching QAD jobs.