Support INT block scale learning by realAsma · Pull Request #1795 · NVIDIA/Model-Optimizer

realAsma · 2026-06-22T23:11:40Z

What does this PR do?

Type of change: new feature, new tests

Adds minimal INT block scale-learning support for the LAQ investigation branch:

Converts eligible static INT block quantizers to StaticBlockScaleQuantizer after max, mse, or local_hessian scale initialization so LAQ can start from non-max initializers.
Adds a fake dynamic INT block quantization path for weight-only dynamic max scale baselines with integer num_bits and block_sizes: {"type": "dynamic"}.
Extends LAQ CPU unit coverage for INT3/INT2 static block max/mse initialization, frozen/original/dual/tied amax variants, and dynamic max weight-only forward.

Usage

quant_cfg = {
    "quant_cfg": [
        {"quantizer_name": "*", "enable": False},
        {
            "quantizer_name": "*weight_quantizer",
            "enable": True,
            "cfg": {"num_bits": 3, "block_sizes": {-1: 16, "type": "static"}},
        },
    ],
    "algorithm": {
        "method": "laq",
        "learnable_amax": ["pre", "post"],
        "tied_amax": False,
        "scale_algorithm": {"method": "mse"},
    },
}
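To illustrate why a non-max initializer can matter at INT3, here is a minimal pure-Python sketch of an MSE-style scale search for a single block. It is not ModelOpt's implementation: the helper names (`quantize_block`, `mse_init_amax`), the candidate grid, and the symmetric signed integer range are all assumptions made for illustration.

```python
def quantize_block(block, amax, qmax):
    """Fake-quantize one block on a symmetric grid of levels in [-qmax, qmax]."""
    scale = amax / qmax
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in block]


def mse_init_amax(block, num_bits=3, num_candidates=20):
    """Pick the amax (hypothetical search) that minimizes squared error for a block."""
    # Assumption: symmetric signed range, e.g. levels -3..3 for num_bits=3.
    qmax = 2 ** (num_bits - 1) - 1
    max_amax = max(abs(w) for w in block)
    if max_amax == 0.0:
        return 0.0  # an all-zero block needs no scale search

    best_amax, best_err = max_amax, float("inf")
    for i in range(1, num_candidates + 1):
        # Candidates are fractions of the max-calibrated amax; i == num_candidates
        # reproduces plain max calibration exactly.
        amax = max_amax * (i / num_candidates)
        dequantized = quantize_block(block, amax, qmax)
        err = sum((w - q) ** 2 for w, q in zip(block, dequantized))
        if err < best_err:
            best_amax, best_err = amax, err
    return best_amax


block = [0.1] * 15 + [1.0]  # one block of 16 weights with a single outlier
print(mse_init_amax(block, num_bits=3))
```

Because the max-calibrated amax is always among the candidates, the selected amax can never give a larger squared error than max calibration on the same block.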

Dynamic max scale baseline:

quant_cfg = {
    "quant_cfg": [
        {"quantizer_name": "*", "enable": False},
        {
            "quantizer_name": "*weight_quantizer",
            "enable": True,
            "cfg": {"num_bits": 2, "block_sizes": {-1: 16, "type": "dynamic"}},
        },
    ],
    "algorithm": "max",
}
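As a rough picture of what the dynamic max baseline computes, the following pure-Python sketch recomputes each block's scale from the block's own maximum absolute value on every call. The function name `fake_quant_dynamic_blocks` is hypothetical, and the symmetric signed range is an assumption; the actual path in `tensor_quant.py` may differ.

```python
def fake_quant_dynamic_blocks(weights, num_bits=2, block_size=16):
    """Fake-quantize a flat list of floats using per-block dynamic max scales."""
    # Assumption: symmetric signed grid, e.g. levels {-1, 0, 1} for num_bits=2.
    qmax = 2 ** (num_bits - 1) - 1
    out = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        amax = max(abs(w) for w in block)  # recomputed on every call ("dynamic")
        if amax == 0.0:
            out.extend(0.0 for _ in block)
            continue
        scale = amax / qmax
        for w in block:
            # Round to the nearest level, clamp, then dequantize back to float.
            q = max(-qmax, min(qmax, round(w / scale)))
            out.append(q * scale)
    return out


print(fake_quant_dynamic_blocks([0.6, -1.0, 0.2, 0.0], num_bits=2, block_size=4))
```

No scale is stored between calls in this sketch, which is what separates it from the static block path, where the amax is calibrated once and can then be learned.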

QAD experiment plan

The QAD experiment plan is intentionally kept out of the PR description and
shared separately with the owner for review before any QAD jobs are launched.

Testing

python_pwd -m ruff check modelopt/torch/quantization/tensor_quant.py modelopt/torch/quantization/nn/modules/tensor_quantizer.py tests/unit/torch/quantization/test_laq.py
pytest_pwd tests/unit/torch/quantization/test_laq.py -q
pre-commit run --files modelopt/torch/quantization/tensor_quant.py modelopt/torch/quantization/nn/modules/tensor_quantizer.py tests/unit/torch/quantization/test_laq.py
pytest_pwd tests/unit/torch/quantization/test_laq.py -q (re-run after pre-commit)

Before your PR is "Ready for review"

Is this change backward compatible?: Yes
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
Did you write any new necessary tests?: Yes
Did you update Changelog?: N/A
Did you get Claude approval on this PR?: N/A, draft PR for experiment-plan review

Additional Information

Draft PR targeting the LAQ branch for owner review before launching QAD jobs.

Signed-off-by: realAsma <akuriparambi@nvidia.com>

copy-pr-bot · 2026-06-22T23:11:44Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

coderabbitai · 2026-06-22T23:11:48Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e8549e43-2005-4bee-a860-0d824d1dcd9f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

realAsma · 2026-06-22T23:31:08Z

Keep tied Dual LSQ as a second-pass reducer if independent Dual LSQ is unstable or too expensive.

What does this mean? It is not clear to me.

realAsma · 2026-06-22T23:32:19Z

Train CE loss.
Eval CE loss.
Eval KL/KD loss when teacher logits are available. -> Fix: QAD uses KL divergence as its training loss.
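For reference, a KL-divergence distillation loss over one token's logits can be sketched as below. This is only an illustration of the metric under discussion: the direction KL(teacher || student), the absence of a temperature, and the function names are assumptions, not the QAD training code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kd_kl_loss(teacher_logits, student_logits):
    """KL(teacher || student) for a single position (assumed direction)."""
    t = log_softmax(teacher_logits)
    s = log_softmax(student_logits)
    # Sum over the vocabulary of p_teacher * (log p_teacher - log p_student).
    return sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t, s))


print(kd_kl_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # identical logits give zero loss
print(kd_kl_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # mismatched logits give a positive loss
```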

realAsma · 2026-06-22T23:32:52Z

Export compatibility status for static INT block variants.

Ignore this please

realAsma · 2026-06-22T23:43:11Z

🤖 Bot comment.

Addressed the QAD-plan comments in the PR body:

Removed the unclear tied-Dual-LSQ fallback wording from the first grid.
Updated metrics so QAD training KL/KD loss is the primary training objective, with train CE only called out if logged separately.
Removed export compatibility from the reviewed metrics list as requested.

Plan comments addressed: #1795 (comment), #1795 (comment), #1795 (comment)

Signed-off-by: realAsma <akuriparambi@nvidia.com>

Support INT block scale learning

2613ab9

Signed-off-by: realAsma <akuriparambi@nvidia.com>

realAsma commented Jun 22, 2026

Comment thread modelopt/torch/quantization/nn/modules/tensor_quantizer.py Outdated

coderabbitai Bot approved these changes Jun 22, 2026

Address INT block review feedback

4f44252

Signed-off-by: realAsma <akuriparambi@nvidia.com>

realAsma force-pushed the asma/laq-int3-int2-scale-learning branch from ec72e81 to 4f44252 on June 22, 2026 23:43

Route dynamic INT blocks through tensor_quant

1a6fa2d

Signed-off-by: realAsma <akuriparambi@nvidia.com>

realAsma wants to merge 3 commits into asma/laq-algorithm from asma/laq-int3-int2-scale-learning
