Implement get_dummy_lora_warmup_rank#702
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Code Review
This pull request adds the get_dummy_lora_warmup_rank method to vllm_lora_worker_manager.py to support vLLM v1's dummy LoRA warmup. The review feedback recommends defensively checking for the existence of _adapter_manager before accessing it to avoid potential AttributeError exceptions due to lazy initialization.
| manager_fn = getattr( | ||
| self._adapter_manager, "get_dummy_lora_warmup_rank", None | ||
| ) |
There was a problem hiding this comment.
Directly accessing self._adapter_manager can raise an AttributeError if get_dummy_lora_warmup_rank is called before create_lora_manager has initialized it (since it is lazily initialized and only annotated in __init__). It is safer to use getattr(self, '_adapter_manager', None) to defensively guard against this.
manager = getattr(self, "_adapter_manager", None)
manager_fn = getattr(
manager, "get_dummy_lora_warmup_rank", None
) if manager is not None else None|
@danielhanchen can you take a look at this? I've picked this up from #608 in hopes of getting it merged sooner since its a genuine bug and the original PR has been open for some time. Thanks |
|
Created an issue to track this fix unslothai/unsloth#6114 |
vLLM v1's
LoRAModelRunnerMixin.maybe_setup_dummy_lorascallsself.lora_manager.get_dummy_lora_warmup_rank(...)during KV-cache profiling / cudagraph warmup. Unsloth replaces vLLM's LoRA manager withunsloth_zoo.vllm_lora_worker_manager.LRUCacheWorkerLoRAManager, which does not implement this method. This results in the following error:get_dummy_lora_warmup_rankimplementation picked up from #608