Do you use it to help with schoolwork / work? Maybe to help you code projects, or to help teach you how to do something?
What are your preferred models and why?
I’ve been experimenting with it for different use cases:
- Standard chat style interface with open-webui. I use it to ask things that people would normally ask ChatGPT. Researching things, vacation plans, etc. I take it all with a grain of salt and also still use search engines
- Parts of different software projects I have using ollama-python. For example, I tried using it to auto summarize transaction data
- Home Assistant voice assistants for my own voice activated smart home
- Trying out code completion using TabbyML
I only have a GeForce 1080 Ti in it, so some projects are a bit slow and I don’t have the biggest models, but what really matters is the self-satisfaction I get by not using somebody else’s model, or that’s what I try to tell myself while I’m waiting for responses.
I’m in the process of trying them out again
Phi4 has been okay for me, and I use deepseek R1 32B quantized for some coding tasks. Both are a lot for my aging m1 MacBook Pro to handle.
Lately Ive been trying deepseek 8b for document summaries and it’s pretty fast but janky.
What I’m working towards is setting up an RSS app and feeding that into a local model (freshRSS I think lets you subscribe to a combined feed) to build a newspaper of my news subscriptions, but that’s not viable until I get a computer to run as a server.
I employ this technique to embellish my email communications, thereby enhancing their perceived authenticity and relatability. Admittedly, I am not particularly adept at articulating my thoughts in comprehensive, well-structured sentences. I tend to favor a more primal, straightforward cognitive style—what one might colloquially refer to as a “meat-and-potatoes” or “caveman” approach to thinking. Ha.
As a voice assistant server for my home assistant setup.
That sounds interesting! Can you describe what software you used for that? And how powerful does the hardware has to be?
I’m currently using it to generate initial contact emails, and generate contextual responses to received replies, for a phishing project at work.
in order to prevent phishing, right? (cue anakin/padme meme)
at work, right?
I currently don’t. But I am ollama-curious. I would like to feed it a bunch of technical manuals and then be able to ask it to recite specs or procedures (with optional links to it’s source info for sanity checking). Is this where I need to be looking/learning?
you might want to look into RAG and ‘long-term memory’ concepts. I’ve been playing around with creating a self-hosted LLM that has long-term memory (using pre-trained models), which is essentially the same thing as you’re describing. Also - GPU matters. I’m using an RTX 4070 and it’s noticeably slower than something like in-browser chatgpt, but I know 4070 is kinda pricey so many home users might have earlier/slower gpu’s.
How have you been making those models? I have a 4070 and doing it locally has been a dependency hellscape, I’ve been tempted to rent cloud GPU time just to save the hassle.
I’m downloading pre-trained models. I had a bunch of dependency issues getting text-generation-webui to work and honestly I probably installed some useless crap in the process, but I did get it to work. LM Studio is much simpler, but less customization(or I just don’t know how to use it all in lm studio). But yea, I’m just downloading pre-trained models and running them in these UI’s (right now I just loaded up ‘deepseek-r1-distill-qwen-7b’ in LM Studio). I also have the nvidia app installed and I make sure my gpu drivers are always up to date.
deleted by creator