Joined 3 years ago
Cake day: July 2nd, 2023

  • If it’s for a single low-frequency workflow, that GPU is enough, but you will be limited to small models, which are mostly useless unless fine-tuned for your use case. If it’s serving users across the entire company with a model big enough to be useful, you would need 192-384GB of VRAM, so a server costing between $20k and $40k.

    The server will require maintenance, and somebody will have to develop the workflow and integration with your data.

    It’s also important to know what they want to do: a basic embedding model for semantic search would work, agents not so much.
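
For anyone who wants to sanity-check the VRAM figures, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the model weights (parameter count times bytes per parameter) plus a flat 20% allowance for KV cache and activations. The function name, the 20% figure, and the example model sizes are my own illustrative assumptions, not measurements; real usage depends heavily on context length and how many users hit the server at once.

```python
def estimate_vram_gb(params_billion, bytes_per_param, overhead_fraction=0.2):
    """Rough VRAM estimate in GB (hypothetical helper, not from any library).

    params_billion:    model size in billions of parameters
    bytes_per_param:   2.0 for 16-bit weights, 1.0 for 8-bit, 0.5 for 4-bit
    overhead_fraction: assumed flat allowance for KV cache and activations
    """
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    weights_gb = params_billion * bytes_per_param
    return weights_gb * (1 + overhead_fraction)


# Illustrative sizes only; swap in the model you actually plan to run.
examples = [
    ("8B, 4-bit", 8, 0.5),
    ("70B, 8-bit", 70, 1.0),
    ("70B, 16-bit", 70, 2.0),
]

for label, params_b, bytes_pp in examples:
    print(f"{label}: ~{estimate_vram_gb(params_b, bytes_pp):.0f} GB")
```

Under these assumptions a 70B model at 16-bit needs roughly 168GB before you add any headroom for concurrent users, which is why a company-wide deployment ends up in multi-GPU server territory while a small quantized model fits on a single card.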