curry@programming.dev to Selfhosted@lemmy.world · Consumer GPUs to run LLMs · 1 day ago
I tried to run Gemma 3 27B Q4_K and was surprised how quickly the VRAM requirements blew up in proportion to the context window, especially compared to other quantized models of similar size, such as QwQ 32B.
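Most of that context-dependent growth is the KV cache, which scales linearly with context length. A rough back-of-envelope sketch of that scaling (the layer/head counts below are illustrative placeholders, not the exact Gemma 3 27B or QwQ 32B configs, and real inference engines may use sliding-window attention or KV-cache quantization that changes the numbers):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context, bytes_per_elem=2):
    """Naive full-attention KV-cache size: 2 tensors (K and V) per layer,
    each of shape [n_kv_heads, context, head_dim], at bytes_per_elem
    (2 for fp16 cache entries)."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * context

# Placeholder hyperparameters for a ~27B-class dense model (assumed, not
# the published Gemma 3 config):
for ctx in (4096, 32768, 131072):
    gib = kv_cache_bytes(n_layers=62, n_kv_heads=16, head_dim=128,
                         context=ctx) / 2**30
    print(f"{ctx:>6} tokens -> ~{gib:.1f} GiB KV cache")
```

Doubling the context doubles the cache, so a model that fits comfortably at 4K context can overflow VRAM well before its advertised maximum context, independent of how aggressively the weights themselves are quantized.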