I’m curious what the consensus is here on which models people use for general-purpose stuff (coding assist, general experimentation, etc.)
What do you consider the “best” model under ~30B parameters?
Qwen 2.5 VL and Coder. I have a VL instance generating image captions for LoRA training right now. The 14B is okay for basic code. A Q6_K_L GGUF quant of the 32B Qwen 2.5 Coder fits in 16GB but runs at a third of the speed of the 14B in bitsandbytes 4-bit. The 14B is fast enough for a couple of layers of agentic stuff in Emacs with gptel, and gets thinking or function calling right out of a llama.cpp server better than 50% of the time. Rough sketches of both workflows below.
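For anyone curious what the captioning loop looks like: here’s a minimal sketch against llama.cpp’s OpenAI-compatible endpoint, assuming llama-server was launched with the VL model and its --mmproj projector. The server address, model alias, and prompt are my illustrative choices, not necessarily the setup above.

```python
"""Batch-caption images for a LoRA dataset via a llama.cpp server
running a Qwen 2.5 VL GGUF (llama-server ... --mmproj ...).
Stdlib only; server URL and model alias are assumptions."""
import base64
import json
import urllib.request
from pathlib import Path

SERVER = "http://localhost:8080/v1/chat/completions"  # assumed llama-server address

def caption(image_path: Path) -> str:
    # The OpenAI-compatible endpoint accepts images as base64 data URIs
    # inside an image_url content part.
    b64 = base64.b64encode(image_path.read_bytes()).decode()
    payload = {
        "model": "qwen2.5-vl",  # assumed alias
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Caption this image in one sentence for LoRA training."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
        "temperature": 0.2,
    }
    req = urllib.request.Request(
        SERVER, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"].strip()

for img in sorted(Path("dataset").glob("*.png")):
    # Caption saved as a sidecar .txt, the usual LoRA dataset layout.
    img.with_suffix(".txt").write_text(caption(img))
    print(img.name, "captioned")
```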
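And the function-calling side, stripped of gptel: one tool-call round against the same endpoint, which is roughly what the agentic layer does per step. Note llama-server generally needs --jinja so the chat template carries tool support; the tool name and schema here are hypothetical.

```python
"""One tool-call round against llama.cpp's OpenAI-compatible endpoint.
Server address, model alias, and the read_file tool are illustrative."""
import json
import urllib.request

SERVER = "http://localhost:8080/v1/chat/completions"

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",  # hypothetical tool for the example
        "description": "Return the contents of a file",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

payload = {
    "model": "qwen2.5-coder-14b",  # assumed alias
    "messages": [{"role": "user", "content": "Show me what's in TODO.md"}],
    "tools": tools,
}
req = urllib.request.Request(
    SERVER, data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    msg = json.load(resp)["choices"][0]["message"]

# The model either answers in plain content or emits tool_calls;
# checking which branch fires is how you'd measure that >50% hit rate.
if msg.get("tool_calls"):
    for call in msg["tool_calls"]:
        print("tool:", call["function"]["name"], call["function"]["arguments"])
else:
    print("plain answer:", msg.get("content"))
```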
I still haven’t tried the new 20B out of OpenAI.