Skip to content

fix(Qwen3-VL-8B): replace deprecated tokenizer arg in SFTTrainer with processing_class and set eos_token in SFTConfig#253

Open
Saad1926Q wants to merge 2 commits into
unslothai:mainfrom
Saad1926Q:fix/qwen3-vl-sft-trainer
Open

fix(Qwen3-VL-8B): replace deprecated tokenizer arg in SFTTrainer with processing_class and set eos_token in SFTConfig#253
Saad1926Q wants to merge 2 commits into
unslothai:mainfrom
Saad1926Q:fix/qwen3-vl-sft-trainer

Conversation

@Saad1926Q

Copy link
Copy Markdown

Context

While using the Qwen3-VL (8B) Vision notebook to fine-tune Qwen3-VL(8B) on ChartQA, I ran into two issues. I figured I'd
contribute the fixes back.

Reference notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_VL_(8B)-Vision.ipynb

Issues Fixed

1. Deprecated tokenizer argument in SFTTrainer

tokenizer = tokenizer is deprecated in newer versions of TRL. The correct argument is processing_class = tokenizer, which properly handles vision processors like Qwen3VLProcessor.

2. Missing eos_token in SFTConfig

Without explicitly setting eos_token, TRL falls back to a placeholder <EOS_TOKEN> which doesn't exist in Qwen3VLProcessor's vocabulary, causing this error at runtime:

ValueError: The specified eos_token ('<EOS_TOKEN>') is not found in the vocabulary of the given processing_class (Qwen3VLProcessor). Ensure that the eos_token exists in the vocabulary before using it as an EOS token.

Fix: pass eos_token = tokenizer.tokenizer.eos_token in SFTConfig to use the inner tokenizer's EOS token.

Changes

  • original_template/Qwen3_VL_(8B)-Vision.ipynb - source fix
  • Regenerated nb/ and python_scripts/ via update_all_notebooks.py

Test Plan

  • Ran the notebook end-to-end with 4-bit quantization - SFTTrainer initializes and trains without error

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the SFTTrainer configuration across several notebooks and Python scripts by renaming the tokenizer parameter to processing_class and explicitly defining the eos_token in SFTConfig. A significant concern was raised regarding the changes in scripts/model_created_at.csv, where numerous entries have been zeroed out and marked with an error status, suggesting a potential regression or data loss that should be investigated.

Comment thread scripts/model_created_at.csv Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c957df738

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread scripts/model_created_at.csv Outdated
@Saad1926Q

Copy link
Copy Markdown
Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant