gfx1201
Here are 7 public repositories matching this topic...
llama.cpp with native AMD RDNA4 (gfx1201) ROCm 7.11 support - 98.97 tok/s AI inference, competitive with RTX 4070 Ti, 32GB VRAM
Updated Jan 3, 2026 - C++
Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.
Updated Jun 22, 2026 - Python
Fine-tune your own LLM on an AMD Radeon GPU — the easy, tested way. QLoRA via ROCm on Windows/WSL2 & Linux, a worked Gemma-4 example, a reusable live training dashboard, and a smoke test that proves the loss falls.
Updated Jun 20, 2026 - HTML
Run PyTorch natively on Windows 11 with an AMD RX 9070 XT (RDNA4 / gfx1201) on stable ROCm 7.2.1 — no WSL2, no Linux, no ZLUDA. Exact pinned wheel URLs, runtime env vars, documented RDNA4 pitfalls (broken nightlies, no xformers/flash-attn/bitsandbytes), and real benchmarks for ComfyUI, SD, LLMs, RVC/TTS. Verified on one 9070 XT.
Updated Jun 16, 2026 - Batchfile
Run Google's Gemma 4 31B with MTP speculative decoding natively on Windows on an AMD Radeon AI PRO R9700 (RDNA4/gfx1201) via llama.cpp + ROCm 7.1.1 — the build recipe, the ROCm cmath compile fix, and a KV-precision / recall / reasoning study (q4_0 runs the full 256K context with no measurable quality loss).
Updated Jun 22, 2026 - PowerShell
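Several of the repositories listed above target one specific GPU architecture string. As a minimal sketch, assuming a ROCm build of PyTorch that exposes a `gcnArchName` attribute on the device properties object, a script could confirm that the visible GPU reports gfx1201 before applying RDNA4-specific settings. The helper names `base_arch`, `is_gfx1201`, and `detect_arch` are hypothetical and are not taken from any listed repository.

```python
from typing import Optional


def base_arch(gcn_arch_name: str) -> str:
    """Return the bare architecture token, e.g. 'gfx1201'.

    ROCm may report the name with feature suffixes separated by ':'
    (for example 'gfx90a:sramecc+:xnack-'), so keep only the first field.
    """
    return gcn_arch_name.split(":", 1)[0].strip().lower()


def is_gfx1201(gcn_arch_name: str) -> bool:
    """True when the reported architecture is gfx1201 (RDNA4)."""
    return base_arch(gcn_arch_name) == "gfx1201"


def detect_arch() -> Optional[str]:
    """Best-effort lookup of the first GPU's architecture name.

    Assumption: a ROCm build of PyTorch is installed. Returns None when
    PyTorch is missing, no GPU is visible, or the attribute is absent.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(0)
    return getattr(props, "gcnArchName", None)


if __name__ == "__main__":
    name = detect_arch()
    if name is None:
        print("No ROCm-visible GPU detected (or PyTorch is not installed).")
    else:
        print(f"{name}: gfx1201 = {is_gfx1201(name)}")
```

Keeping the string parsing separate from the device lookup means the check can be exercised on a machine without a GPU.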