I’m curious about what the consensus is here for which models are used for general purpose stuff (coding assist, general experimentation, etc)
What do you consider the “best” model under ~30B parameters?
In my opinion, Qwen3-30B-A3B-2507 would be the best here. The Thinking version is likely the better pick for most tasks, as long as you don’t mind a slight speed penalty in exchange for more accuracy. I use the quantized IQ4_XS models from Bartowski or Unsloth on HuggingFace.
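For anyone who hasn’t run one of these quants before, here’s roughly how I’d load it with llama-cpp-python. The repo and file names are from memory, so double-check them on the Bartowski or Unsloth HuggingFace pages before running:

```python
# Minimal sketch: pull an IQ4_XS GGUF quant from HuggingFace and chat with it.
# repo_id/filename below are assumptions -- verify the exact names on HF.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="bartowski/Qwen3-30B-A3B-Instruct-2507-GGUF",  # assumed repo name
    filename="*IQ4_XS.gguf",  # glob pattern matching the IQ4_XS quant file
    n_ctx=8192,               # context window; raise it if VRAM allows
    n_gpu_layers=-1,          # offload all layers to the GPU if they fit
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that parses ISO 8601 dates."}]
)
print(out["choices"][0]["message"]["content"])
```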
I’ve seen OpenAI’s new gpt-oss-20b ranked well in benchmarks, but I haven’t liked the output at all. It typically seems lazy and not very comprehensive, and it makes obvious errors.
If you want something even smaller and faster, DeepSeek-R1-0528-Qwen3-8B (DeepSeek R1 0528 distilled into Qwen3 8B) is great for its size, especially if you’re trying to free up some VRAM to use larger context lengths.
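Same idea as above for the 8B distill, just with the context cranked up, since the smaller weights leave VRAM to spare (again, the repo name is approximate):

```python
# Sketch of the size-vs-context trade-off: an 8B model at IQ4_XS leaves room
# for a much larger n_ctx on the same card. repo_id is an assumption.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF",  # assumed repo name
    filename="*IQ4_XS.gguf",
    n_ctx=32768,      # roomier context window thanks to the smaller model
    n_gpu_layers=-1,  # still fully offloaded to the GPU
)
```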
That’s what I’m using, and it’s pretty nice. Thanks for your input!