Fix HF PTQ empty-init dtype kwargs #1853

Closed
realAsma wants to merge 2 commits into main from beebot/nvbug-6359821

@realAsma

Contributor

Summary

Fixes NVBug 6359821: hf_ptq.py can fail during the empty-weight device-map probe for remote/custom architectures like DeciLMForCausalLM because the probe forwards dtype into from_config(), and that kwarg can leak to the custom model constructor.

This change removes dtype-related kwargs from the temporary from_config() call and instead sets PyTorch's default dtype only around the empty-weight construction used for infer_auto_device_map.

NVBug: https://nvbugspro.nvidia.com/bug/6359821

Validation

  • pre-commit run --files examples/hf_ptq/example_utils.py tests/examples/hf_ptq/test_example_utils.py
  • pytest_pwd tests/examples/hf_ptq/test_example_utils.py (13 passed)

@realAsma realAsma added the cherry-pick-0.45.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Jun 29, 2026
@copy-pr-bot

copy-pr-bot Bot commented Jun 29, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0b41196b-deae-4b57-ab96-79025eb57d27


@codecov

codecov Bot commented Jun 29, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.40%. Comparing base (c5e7167) to head (c31753b).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1853   +/-   ##
=======================================
  Coverage   77.40%   77.40%           
=======================================
  Files         515      515           
  Lines       57118    57118           
=======================================
  Hits        44214    44214           
  Misses      12904    12904           
Flag    Coverage Δ
unit    54.92% <ø> (ø)


Comment thread examples/hf_ptq/example_utils.py Outdated
Comment on lines +619 to +623
def _is_unexpected_dtype_kwarg_error(error):
    message = str(error)
    return "unexpected keyword argument" in message and (
        "'dtype'" in message or '"dtype"' in message
    )

Contributor Author

This helper method is not needed.

Comment thread examples/hf_ptq/example_utils.py Outdated
Comment on lines +781 to +788
  # unless specified by the hf_config.
- torch_dtype = getattr(hf_config, "torch_dtype", torch.bfloat16)
- model_kwargs2 = model_kwargs.copy()
+ torch_dtype = _empty_model_init_dtype(hf_config)
+ model_kwargs2 = _empty_model_init_kwargs(model_kwargs, torch_dtype)
  if auto_model_module not in [AutoModelForCausalLM, AutoModel]:
      model_kwargs2.pop("trust_remote_code", None)
- model_kwargs2["dtype"] = torch_dtype
  model_kwargs2.pop("max_memory", None)
- model = from_config(hf_config, **model_kwargs2)
+ model = _from_config_for_empty_weights(
+     from_config, hf_config, model_kwargs2, torch_dtype
+ )

Contributor Author

This intermixes low-level and high-level code. Why not move all of this into _from_config_for_empty_weights?
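One way to read this suggestion is a single helper that owns every probe-only kwarg adjustment, so the call site shrinks to one line. The sketch below is only an illustration under assumptions, not the code that landed in the follow-up PR: the function signature, the `keep_trust_remote_code` flag, the `dtype_context` hook, and the `_DTYPE_KEYS` tuple are all hypothetical, and dtypes are plain strings so that it runs without PyTorch or transformers.

```python
from contextlib import nullcontext
from types import SimpleNamespace

# Hypothetical set of kwarg names treated as dtype-related in this sketch.
_DTYPE_KEYS = ("dtype", "torch_dtype")


def from_config_for_empty_weights(
    from_config,
    hf_config,
    model_kwargs,
    keep_trust_remote_code=True,
    dtype_context=nullcontext,
):
    """Build the throwaway probe model; all kwarg cleanup happens here."""
    # Assumption: fall back to "bfloat16" when the config names no dtype.
    dtype = getattr(hf_config, "torch_dtype", None) or "bfloat16"

    probe_kwargs = dict(model_kwargs)  # never mutate the caller's kwargs
    for key in _DTYPE_KEYS:
        probe_kwargs.pop(key, None)  # dtype must not reach the constructor
    probe_kwargs.pop("max_memory", None)
    if not keep_trust_remote_code:
        probe_kwargs.pop("trust_remote_code", None)

    # dtype_context is where a real implementation would scope the default
    # dtype; nullcontext makes this sketch a no-op in that respect.
    with dtype_context(dtype):
        return from_config(hf_config, **probe_kwargs)


# Usage with a fake loader that records the kwargs it receives.
received = {}


def fake_from_config(config, **kwargs):
    received.update(kwargs)
    return "probe-model"


caller_kwargs = {
    "dtype": "float16",
    "max_memory": {0: "1GiB"},
    "trust_remote_code": True,
    "attn_implementation": "eager",
}
result = from_config_for_empty_weights(
    fake_from_config, SimpleNamespace(), caller_kwargs, keep_trust_remote_code=False
)
```

With this shape the caller no longer needs to know which kwargs are unsafe for the probe, and its own `model_kwargs` dictionary is left untouched for the real load.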

@realAsma realAsma force-pushed the beebot/nvbug-6359821 branch from 6c67f65 to c133b8c Compare June 29, 2026 22:26
Signed-off-by: realAsma <akuriparambi@nvidia.com>
@realAsma

Contributor Author

Superseded by #1857 from branch asma/nvbug-6359821.

@realAsma realAsma closed this Jun 29, 2026
@github-actions

Contributor
PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-29 22:56 UTC

@kevalmorabia97 kevalmorabia97 removed the cherry-pick-0.45.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Jul 1, 2026