Wondering about services to test on either a 16 GB RAM “AI capable” arm64 board or on a laptop with a modern RTX GPU. Only looking for open-source options, but curious to hear what people say. Cheers!
I’ve got an old gaming PC with a decent GPU lying around, and I’ve thought of doing that (I currently use it for Linux gaming and GPU-heavy tasks like photo editing). However, I’m stuck on the idea of using LLMs on demand locally with Ollama: the energy cost of keeping the machine powered on all the time just for occasional queries seems a bit overkill to me…
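One way around the always-on cost is to leave the box suspended and wake it over the network right before you query it. Here's a minimal sketch of sending a Wake-on-LAN magic packet in Python, assuming the PC's NIC supports WoL and it's enabled in the BIOS; the MAC address is a placeholder:

```python
import socket

def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send a Wake-on-LAN magic packet: 6 bytes of 0xFF followed by the
    target MAC address repeated 16 times, broadcast over UDP."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    packet = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

# hypothetical MAC of the gaming PC's network card
send_magic_packet("aa:bb:cc:dd:ee:ff")
```

Pair that with a suspend-on-idle timer on the PC and it only draws power while you're actually running queries.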
I put my Plex media server to work running Ollama: the GPU it has for transcoding isn’t awful for simple LLMs.
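Once Ollama is running on the server, other machines on the LAN can hit its HTTP API directly. A minimal sketch, assuming Ollama's default port (11434) and a model you've already pulled (the model name here is just an example):

```python
import json
import urllib.request

# Ollama's HTTP API listens on port 11434 by default.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3",              # assumption: any locally pulled model
        "prompt": "Why is the sky blue?",
        "stream": False,                # return one JSON object instead of a stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```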
That sounds like a great way of leveraging existing infrastructure! I host Plex together with other services on a server whose Intel CPU handles the transcoding. I’m quite sure I’d get much better performance from the GPU machine, so I might end up following this path!
Have to agree on that. It certainly only makes sense to have it up when you’re using it.