Fix HF PTQ empty-init dtype kwargs by realAsma · Pull Request #1853 · NVIDIA/Model-Optimizer

realAsma · 2026-06-29T16:31:48Z

Summary

Fixes NVBug 6359821: hf_ptq.py can fail during the empty-weight device-map probe for remote/custom architectures like DeciLMForCausalLM because the probe forwards dtype into from_config(), and that kwarg can leak to the custom model constructor.
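The failure mode described above can be reproduced in miniature without any ML libraries. The sketch below is only an illustration: `CustomConfig`, `CustomModel`, `from_config`, and `probe` are hypothetical stand-ins, and the real `from_config()` path in Transformers is more involved than this one-line forwarder.

```python
class CustomConfig:
    """Hypothetical stand-in for a remote-code model config."""

    hidden_size = 8


class CustomModel:
    """Hypothetical custom architecture whose constructor accepts only a config."""

    def __init__(self, config):
        self.config = config


def from_config(config, **kwargs):
    """Simplified stand-in: forwards every extra kwarg to the model constructor."""
    return CustomModel(config, **kwargs)


def probe(config, **kwargs):
    """Return the constructed model, or the error message if construction fails."""
    try:
        return from_config(config, **kwargs)
    except TypeError as error:
        return str(error)


# The dtype kwarg leaks through to __init__, which does not accept it.
leaked = probe(CustomConfig(), dtype="bfloat16")
# Without the dtype kwarg the same probe succeeds.
clean = probe(CustomConfig())
```

Here `leaked` holds a `TypeError` message about an unexpected keyword argument `'dtype'`, while `clean` is a constructed model.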

This change removes dtype-related kwargs from the temporary from_config() call and instead sets PyTorch's default dtype only around the empty-weight construction used for infer_auto_device_map.
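A minimal sketch of scoping the default dtype to the construction step, assuming only the public `torch.get_default_dtype()` and `torch.set_default_dtype()` calls. The `default_dtype` context manager is a hypothetical name, not necessarily what the PR uses, and a meta-device `torch.nn.Linear` stands in for the model built by `from_config()`.

```python
from contextlib import contextmanager

import torch


@contextmanager
def default_dtype(dtype):
    """Temporarily set PyTorch's default floating-point dtype, restoring it on exit."""
    previous = torch.get_default_dtype()
    torch.set_default_dtype(dtype)
    try:
        yield
    finally:
        # Restore even if construction raises, so later code is unaffected.
        torch.set_default_dtype(previous)


before = torch.get_default_dtype()
with default_dtype(torch.bfloat16):
    # Stand-in for the empty-weight from_config() call made without a dtype kwarg.
    layer = torch.nn.Linear(4, 4, device="meta")
after = torch.get_default_dtype()
```

Because the dtype is supplied through global state rather than a keyword argument, nothing reaches the custom model constructor, and the `finally` clause keeps the change from outliving the probe.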

NVBug: https://nvbugspro.nvidia.com/bug/6359821

Validation

pre-commit run --files examples/hf_ptq/example_utils.py tests/examples/hf_ptq/test_example_utils.py
pytest_pwd tests/examples/hf_ptq/test_example_utils.py (13 passed)

copy-pr-bot · 2026-06-29T16:31:51Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


coderabbitai · 2026-06-29T16:36:42Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0b41196b-deae-4b57-ab96-79025eb57d27


codecov · 2026-06-29T16:41:39Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.40%. Comparing base (c5e7167) to head (c31753b).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1853   +/-   ##
=======================================
  Coverage   77.40%   77.40%           
=======================================
  Files         515      515           
  Lines       57118    57118           
=======================================
  Hits        44214    44214           
  Misses      12904    12904

Flag	Coverage Δ
unit	`54.92% <ø> (ø)`


realAsma · 2026-06-29T21:34:09Z

+def _is_unexpected_dtype_kwarg_error(error):
+    message = str(error)
+    return "unexpected keyword argument" in message and (
+        "'dtype'" in message or '"dtype"' in message
+    )


No need for this helper method.

realAsma · 2026-06-29T21:36:30Z

            # unless specified by the hf_config.
-            torch_dtype = getattr(hf_config, "torch_dtype", torch.bfloat16)
-            model_kwargs2 = model_kwargs.copy()
+            torch_dtype = _empty_model_init_dtype(hf_config)
+            model_kwargs2 = _empty_model_init_kwargs(model_kwargs, torch_dtype)
            if auto_model_module not in [AutoModelForCausalLM, AutoModel]:
                model_kwargs2.pop("trust_remote_code", None)
-            model_kwargs2["dtype"] = torch_dtype
-            model_kwargs2.pop("max_memory", None)
-            model = from_config(hf_config, **model_kwargs2)
+            model = _from_config_for_empty_weights(
+                from_config, hf_config, model_kwargs2, torch_dtype
+            )


This intermixes low-level and high-level code. Why not move all of this into _from_config_for_empty_weights?
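One way the consolidation suggested above might look, as a rough sketch rather than the code that landed in the follow-up PR. The function name without the leading underscore, the `keep_trust_remote_code` and `dtype_scope` parameters, and the exact set of dropped keys are all assumptions made for illustration.

```python
from contextlib import nullcontext

# Assumed set of kwargs that should not reach from_config() during the probe.
_PROBE_DROPPED_KWARGS = ("dtype", "torch_dtype", "max_memory")


def from_config_for_empty_weights(
    from_config,
    hf_config,
    model_kwargs,
    *,
    keep_trust_remote_code=True,
    dtype_scope=nullcontext,
):
    """Hypothetical helper that owns all probe-specific kwarg handling."""
    kwargs = dict(model_kwargs)  # copy so the caller's dict is never mutated
    for key in _PROBE_DROPPED_KWARGS:
        kwargs.pop(key, None)
    if not keep_trust_remote_code:
        kwargs.pop("trust_remote_code", None)
    # dtype_scope is a zero-argument callable returning a context manager,
    # for example one that temporarily changes the default dtype.
    with dtype_scope():
        return from_config(hf_config, **kwargs)
```

With a helper shaped like this, the call site reduces to a single call and no longer needs to know which kwargs are unsafe for the probe.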

Signed-off-by: realAsma <akuriparambi@nvidia.com>

realAsma · 2026-06-29T22:55:56Z

Superseded by #1857 from branch asma/nvbug-6359821.

github-actions · 2026-06-29T22:56:22Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-29 22:56 UTC

realAsma added the cherry-pick-0.45.0 label (After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc) on Jun 29, 2026


Comment thread examples/hf_ptq/example_utils.py Outdated

coderabbitai Bot approved these changes Jun 29, 2026


Fix HF PTQ empty-init dtype fallback

c133b8c

realAsma force-pushed the beebot/nvbug-6359821 branch from 6c67f65 to c133b8c Compare June 29, 2026 22:26

Simplify HF PTQ empty init dtype fix

7c94629

Signed-off-by: realAsma <akuriparambi@nvidia.com>

realAsma closed this Jun 29, 2026

kevalmorabia97 removed the cherry-pick-0.45.0 label (After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc) on Jul 1, 2026

realAsma wants to merge 2 commits into main from beebot/nvbug-6359821
