unslothai · danielhanchen · Jan 4, 2026 · Jan 8, 2026 · Jan 9, 2026 · Jan 9, 2026
diff --git a/nb/Advanced_Llama3_1_(3B)_GRPO_LoRA.ipynb b/nb/Advanced_Llama3_1_(3B)_GRPO_LoRA.ipynb
@@ -8,10 +8,10 @@
     "<div class=\"align-center\">\n",
     "<a href=\"https://unsloth.ai/\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png\" width=\"115\"></a>\n",
     "<a href=\"https://discord.gg/unsloth\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/Discord button.png\" width=\"145\"></a>\n",
-    "<a href=\"https://docs.unsloth.ai/\"><img src=\"https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true\" width=\"125\"></a></a> Join Discord if you need help + ⭐ <i>Star us on <a href=\"https://github.com/unslothai/unsloth\">Github</a> </i> ⭐\n",
+    "<a href=\"https://unsloth.ai/docs/\"><img src=\"https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true\" width=\"125\"></a> Join Discord if you need help + ⭐ <i>Star us on <a href=\"https://github.com/unslothai/unsloth\">Github</a> </i> ⭐\n",
     "</div>\n",
     "\n",
-    "To install Unsloth on your own computer, follow the installation instructions on our Github page [here](https://docs.unsloth.ai/get-started/installing-+-updating).\n",
+    "To install Unsloth on your own computer, follow the installation instructions on our docs page [here](https://unsloth.ai/docs/get-started/installing-+-updating).\n",
     "\n",
     "You will learn how to do [data prep](#Data), how to [train](#Train), how to [run the model](#Inference), & [how to save it](#Save)\n"
    ]
@@ -32,13 +32,13 @@
    },
    "source": [
     "\n",
-    "Introducing Unsloth [Standby for RL](https://docs.unsloth.ai/basics/memory-efficient-rl): GRPO is now faster, uses 30% less memory with 2x longer context.\n",
+    "Introducing Unsloth [Standby for RL](https://unsloth.ai/docs/basics/memory-efficient-rl): GRPO is now faster, uses 30% less memory with 2x longer context.\n",
     "\n",
-    "Gpt-oss fine-tuning now supports 8× longer context with 0 accuracy loss. [Read more](https://docs.unsloth.ai/basics/long-context-gpt-oss-training)\n",
+    "Gpt-oss fine-tuning now supports 8× longer context with 0 accuracy loss. [Read more](https://unsloth.ai/docs/basics/long-context-gpt-oss-training)\n",
     "\n",
-    "Unsloth now supports Text-to-Speech (TTS) models. Read our [guide here](https://docs.unsloth.ai/basics/text-to-speech-tts-fine-tuning).\n",
+    "Unsloth now supports Text-to-Speech (TTS) models. Read our [guide here](https://unsloth.ai/docs/basics/text-to-speech-tts-fine-tuning).\n",
     "\n",
-    "Visit our docs for all our [model uploads](https://docs.unsloth.ai/get-started/all-our-models) and [notebooks](https://docs.unsloth.ai/get-started/unsloth-notebooks).\n"
+    "Visit our docs for all our [model uploads](https://unsloth.ai/docs/get-started/all-our-models) and [notebooks](https://unsloth.ai/docs/get-started/unsloth-notebooks).\n"
    ]
   },
   {
@@ -9703,20 +9703,20 @@
    "outputs": [],
    "source": [
     "# Merge to 16bit\n",
-    "if False: model.save_pretrained_merged(\"model\", tokenizer, save_method = \"merged_16bit\",)\n",
-    "if False: model.push_to_hub_merged(\"hf/model\", tokenizer, save_method = \"merged_16bit\", token = \"\")\n",
+    "if False: model.save_pretrained_merged(\"advanced_llama_finetune_16bit\", tokenizer, save_method = \"merged_16bit\",)\n",
+    "if False: model.push_to_hub_merged(\"HF_USERNAME/advanced_llama_finetune_16bit\", tokenizer, save_method = \"merged_16bit\", token = \"YOUR_HF_TOKEN\")\n",
     "\n",
     "# Merge to 4bit\n",
-    "if False: model.save_pretrained_merged(\"model\", tokenizer, save_method = \"merged_4bit\",)\n",
-    "if False: model.push_to_hub_merged(\"hf/model\", tokenizer, save_method = \"merged_4bit\", token = \"\")\n",
+    "if False: model.save_pretrained_merged(\"advanced_llama_finetune_4bit\", tokenizer, save_method = \"merged_4bit\",)\n",
+    "if False: model.push_to_hub_merged(\"HF_USERNAME/advanced_llama_finetune_4bit\", tokenizer, save_method = \"merged_4bit\", token = \"YOUR_HF_TOKEN\")\n",
     "\n",
     "# Just LoRA adapters\n",
     "if False:\n",
-    "    model.save_pretrained(\"model\")\n",
-    "    tokenizer.save_pretrained(\"model\")\n",
+    "    model.save_pretrained(\"advanced_llama_lora\")\n",
+    "    tokenizer.save_pretrained(\"advanced_llama_lora\")\n",
     "if False:\n",
-    "    model.push_to_hub(\"hf/model\", token = \"\")\n",
-    "    tokenizer.push_to_hub(\"hf/model\", token = \"\")\n"
+    "    model.push_to_hub(\"HF_USERNAME/advanced_llama_lora\", token = \"YOUR_HF_TOKEN\")\n",
+    "    tokenizer.push_to_hub(\"HF_USERNAME/advanced_llama_lora\", token = \"YOUR_HF_TOKEN\")\n"
    ]
   },
   {
@@ -9728,7 +9728,7 @@
     "### GGUF / llama.cpp Conversion\n",
     "To save to `GGUF` / `llama.cpp`, we support it natively now! We clone `llama.cpp` and we default save it to `q8_0`. We allow all methods like `q4_k_m`. Use `save_pretrained_gguf` for local saving and `push_to_hub_gguf` for uploading to HF.\n",
     "\n",
-    "Some supported quant methods (full list on our [Wiki page](https://github.com/unslothai/unsloth/wiki#gguf-quantization-options)):\n",
+    "Some supported quant methods (full list on our [docs page](https://unsloth.ai/docs/basics/inference-and-deployment/saving-to-gguf)):\n",
     "* `q8_0` - Fast conversion. High resource use, but generally acceptable.\n",
     "* `q4_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.\n",
     "* `q5_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.\n",
@@ -9745,26 +9745,26 @@
    "outputs": [],
    "source": [
     "# Save to 8bit Q8_0\n",
-    "if False: model.save_pretrained_gguf(\"model\", tokenizer,)\n",
+    "if False: model.save_pretrained_gguf(\"advanced_llama_finetune\", tokenizer,)\n",
     "# Remember to go to https://huggingface.co/settings/tokens for a token!\n",
     "# And change hf to your username!\n",
-    "if False: model.push_to_hub_gguf(\"hf/model\", tokenizer, token = \"\")\n",
+    "if False: model.push_to_hub_gguf(\"HF_USERNAME/advanced_llama_finetune\", tokenizer, token = \"YOUR_HF_TOKEN\")\n",
     "\n",
     "# Save to 16bit GGUF\n",
-    "if False: model.save_pretrained_gguf(\"model\", tokenizer, quantization_method = \"f16\")\n",
-    "if False: model.push_to_hub_gguf(\"hf/model\", tokenizer, quantization_method = \"f16\", token = \"\")\n",
+    "if False: model.save_pretrained_gguf(\"advanced_llama_finetune\", tokenizer, quantization_method = \"f16\")\n",
+    "if False: model.push_to_hub_gguf(\"HF_USERNAME/advanced_llama_finetune\", tokenizer, quantization_method = \"f16\", token = \"YOUR_HF_TOKEN\")\n",
     "\n",
     "# Save to q4_k_m GGUF\n",
-    "if False: model.save_pretrained_gguf(\"model\", tokenizer, quantization_method = \"q4_k_m\")\n",
-    "if False: model.push_to_hub_gguf(\"hf/model\", tokenizer, quantization_method = \"q4_k_m\", token = \"\")\n",
+    "if False: model.save_pretrained_gguf(\"advanced_llama_finetune\", tokenizer, quantization_method = \"q4_k_m\")\n",
+    "if False: model.push_to_hub_gguf(\"HF_USERNAME/advanced_llama_finetune\", tokenizer, quantization_method = \"q4_k_m\", token = \"YOUR_HF_TOKEN\")\n",
     "\n",
     "# Save to multiple GGUF options - much faster if you want multiple!\n",
     "if False:\n",
     "    model.push_to_hub_gguf(\n",
-    "        \"hf/model\", # Change hf to your username!\n",
+    "        \"HF_USERNAME/advanced_llama_finetune\", # Change HF_USERNAME to your username!\n",
     "        tokenizer,\n",
     "        quantization_method = [\"q4_k_m\", \"q8_0\", \"q5_k_m\",],\n",
-    "        token = \"\",\n",
+    "        token = \"YOUR_HF_TOKEN\",\n",
     "    )"
    ]
   },
@@ -9774,20 +9774,20 @@
     "id": "V15Yhj1V9lwG"
    },
    "source": [
-    "Now, use the `model-unsloth.gguf` file or `model-unsloth-Q4_K_M.gguf` file in llama.cpp.\n",
+    "Now, use the `advanced_llama_finetune.Q8_0.gguf` file or `advanced_llama_finetune.Q4_K_M.gguf` file in llama.cpp.\n",
     "\n",
     "And we're done! If you have any questions on Unsloth, we have a [Discord](https://discord.gg/unsloth) channel! If you find any bugs or want to keep updated with the latest LLM stuff, or need help, join projects etc, feel free to join our Discord!\n",
     "\n",
     "Some other links:\n",
     "1. Train your own reasoning model - Llama GRPO notebook [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb)\n",
     "2. Saving finetunes to Ollama. [Free notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)\n",
     "3. Llama 3.2 Vision finetuning - Radiography use case. [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)\n",
-    "6. See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our [documentation](https://docs.unsloth.ai/get-started/unsloth-notebooks)!\n",
+    "4. See notebooks for DPO, ORPO, continued pretraining, conversational finetuning and more in our [documentation](https://unsloth.ai/docs/get-started/unsloth-notebooks)!\n",
     "\n",
     "<div class=\"align-center\">\n",
     "  <a href=\"https://unsloth.ai\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png\" width=\"115\"></a>\n",
     "  <a href=\"https://discord.gg/unsloth\"><img src=\"https://github.com/unslothai/unsloth/raw/main/images/Discord.png\" width=\"145\"></a>\n",
-    "  <a href=\"https://docs.unsloth.ai/\"><img src=\"https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true\" width=\"125\"></a>\n",
+    "  <a href=\"https://unsloth.ai/docs/\"><img src=\"https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true\" width=\"125\"></a>\n",
     "\n",
     "  Join Discord if you need help + ⭐️ <i>Star us on <a href=\"https://github.com/unslothai/unsloth\">Github</a> </i> ⭐️\n",
     "\n",