• 1 Post
  • 135 Comments
Joined 3 years ago
Cake day: June 14th, 2023

  • Way too much. The Nvidia P40 I scavenged for my homemade AI server runs at 120W (throttled down from the 250W default) on its own. Then I’ve got two more PCs running purely as redundant firewalls with automatic failover. Pretty unnecessary, but if that’s not the sort of thing homelabbing is for then I’m going to keep doing it wrong, because I find it fun. Then there’s the Minecraft server, which is pretty beefy and also eternally running at max CPU because my niece is a monster who loves spamming spawn eggs and should never be allowed access to creative mode. And I don’t even have the two rack units of disk arrays I bought at auction powered up yet, because they need 240V, which I don’t have handy. I guess someone could do the math on what 48 enterprise SAS drives will pull if they need to satisfy their curiosity; I’m not sure I want to. I will hook them up someday, but for now ignorance is bliss. All I know is it’s a lot, and there’s stuff I’m not even including in this.
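
    For anyone who does want to do that math, here is a rough sketch. The per-drive wattages are placeholder assumptions, not measurements and not datasheet values for my actual drives, and the enclosures, fans, and controllers would add more on top, so treat the output as a ballpark only.

```python
# Rough power estimate for a shelf of spinning drives.
# ASSUMPTION: the per-drive wattages passed in below are illustrative
# placeholders. Check the datasheet for the real drive model.

def array_watts(drive_count: int, watts_per_drive: float) -> float:
    """Total draw of the drives alone (no enclosure, fans, or controllers)."""
    return drive_count * watts_per_drive


def daily_kwh(watts: float) -> float:
    """Energy used in 24 hours at a constant draw."""
    return watts * 24 / 1000


if __name__ == "__main__":
    for per_drive in (6.0, 12.0):  # assumed low and high guesses, in watts
        total = array_watts(48, per_drive)
        print(f"{per_drive:>4} W/drive -> {total:.0f} W, {daily_kwh(total):.1f} kWh/day")
```

    Whatever the real per-drive number turns out to be, multiplying it by 48 is the part I’m not looking forward to.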



  • Short answer: You can’t. Longer answer: Accept it if/when it happens, but don’t make yourself an attractive target, and don’t put yourself in a position where it’s going to cost you money if it does. The hype around LLM scrapers is largely overblown for small personal static pages. LLM training wants fresh, data-heavy content. If they are scraping your smolweb site, either you’re hosting rich content and updating it far too frequently, or it’s an error on their part out of pure ignorance and laziness. That doesn’t mean it can’t happen, but also, what actual harm does it do to have a dozen scrapers hitting your site every second? (This is an exaggeration; it’s likely not going to be that bad.) How big are your smolweb page and its images? A few dozen kilobytes? What’s your bandwidth limit, and what happens when you hit the cap? If you’re worried about hitting the cap too quickly, this can be straightforwardly managed with per-IP rate limiting, and throttling if necessary, to keep things under the cap and allow fair access to gentler users. But when you’re only hosting small files, most connections have plenty of bandwidth to handle scraping until the scrapers realize how pointless it is and give up, so it probably won’t be necessary.
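
    To show what per-IP rate limiting means in practice, here is a minimal token-bucket sketch. The class name, the `rate` and `burst` parameters, and the injectable `clock` are all made up for illustration; in a real setup you would more likely turn on the equivalent feature in your web server or reverse proxy than write this yourself.

```python
import time


class PerIPRateLimiter:
    """Token bucket per client IP: `rate` requests/second, bursts up to `burst`.

    Hypothetical helper for illustration, not taken from any real server.
    """

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock  # injectable so the behaviour can be tested
        # Maps each IP to (tokens_remaining, time_of_last_update).
        self.buckets = {}

    def allow(self, ip: str) -> bool:
        now = self.clock()
        tokens, last = self.buckets.get(ip, (float(self.burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return True  # serve the request
        self.buckets[ip] = (tokens, now)
        return False  # caller should answer with something like HTTP 429
```

    Each IP gets its own bucket, so one greedy scraper runs dry without slowing down anyone else.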

    I run about 20 small websites, all public and searchable, with no protections at all. Most of them are rarely updated and have been static for years. I just checked my traffic logs for the last day: ~14,000 hits. That may sound like a lot, but each request takes milliseconds to deliver, so a computer sitting around doing nothing for the many seconds in between those requests is probably bored. Many different scrapers are obviously buried in that traffic, but they’re not the overwhelming horror that people make them out to be, at least in my experience.
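
    To put numbers on “many seconds in between”, here is the arithmetic as a quick sketch. The 14,000 hits per day comes from my logs; the 5 ms service time is an assumed stand-in for “milliseconds”, not something I measured.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_gap_seconds(hits_per_day: int) -> float:
    """Mean time between requests if they were spread evenly over the day."""
    return SECONDS_PER_DAY / hits_per_day


def busy_fraction(hits_per_day: int, service_ms: float) -> float:
    """Share of the day spent actually serving requests."""
    return hits_per_day * (service_ms / 1000) / SECONDS_PER_DAY


if __name__ == "__main__":
    hits = 14_000
    print(f"average gap: {average_gap_seconds(hits):.1f} s")
    # ASSUMPTION: 5 ms per request is a placeholder service time.
    print(f"busy: {busy_fraction(hits, 5.0):.4%} of the day")
```

    Real traffic is burstier than an even spread, but the average is still one request every six seconds or so across all the sites combined.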

    Anubis potentially makes sense on social media sites like Lemmy that are hosting large numbers of users and user-generated content. This stuff is like manna from heaven for LLM bots. Same with code repositories like Forgejo. They are very attractive targets for scrapers, with lots of frequent updates that require frequent scraping and also lots of very large files to download and ingest. This will absolutely hammer your bandwidth if the scrapers find you an attractive target and they are stupid (which they are).

    But smolweb? Honestly, I hate to break it to you but nobody cares that much, not even LLMs.



  • I believe you, and I feel for you. The saddest part about AI is how it has tainted all high-effort, carefully organized work, to the point that it’s hard to distinguish between the most trustworthy content and the least trustworthy. We need better tools for information provenance. Like I said, the first thing I did was look into your backgrounds to try to understand “is this some AI slop bot or a real person with a real brain”. Everything I looked at suggested it was legit, and that’s why I decided to give you the benefit of the doubt. But that doubt is everywhere nowadays. It’s rough out there.



  • That’s great for production AWS managed services, but that still sounds like the opposite of self-hosting to me. I don’t need scaling like that. I’m not lying when I admit I’m using sshfs (which was a slightly tongue-in-cheek counterpoint to S3), and despite everyone dunking on it, it is in fact working perfectly at my scale. I know I’ve been downvoted to purgatory, but I still stand by my original comment. I don’t understand why you would need S3 or S3 compatibility in a self-hosting context. The closest someone has come to explaining it is the guy who said choice is good… like, yeah, it’s good to have the choice I guess, but… it still doesn’t seem like a great choice for self-hosting. I appreciate you trying to explain it, but I feel like everyone is missing the self-hosting context here. For a little home lab I simply don’t see the value. Why are people promoting AWS and AWS-adjacent services here?





  • cecilkorik@lemmy.ca to Selfhosted@lemmy.world · KitchenOwl Gone? · 1 month ago

    He’s still right that it’s weird people like you are going to bat to defend them. Microsoft sucks. It must get tiring if you have to call out every inaccurate thing everyone says to try to tear them down. The important take-home message is that we need to tear them down, they suck.

    You don’t see people bothering to defend Epstein, for example. Even though there’s lots of inaccurate stuff going around, there’s enough accurate stuff to be absolutely confident he was an absolute loathsome piece of shit not worth defending. Not worth the effort to defend. Why bother?

    What do you see in Microsoft that you think is worth defending? GitHub is shit, and it’s evil. Let it go.


  • Private trackers are like the Matrix’s “Zion”. When civilization collapses into a dystopian surveillance-capitalism hellscape and the AIs and fascist governments take over the net, the last free humans will be hiding in private tracker communities, sharing freely and building a resistance. Will we have mechs with gatling guns? I don’t know. All I can say is I hope so, because it looks like we’re going to need them.







  • Absolutely. There are tons of open-licenced, open-weight models (open-weight being the equivalent of open-source for AI models) capable of what is called “tool usage”. The key thing to understand is that they’re never quite perfect, and they don’t all “use tools” as effectively or in the same way as each other. This is common to all LLMs, and it is critical to understand that at the end of the day they are just text generators; they do not “use tools” themselves. They create specific structured text that triggers some other software, typically called a harness (but it could also be called a client or frontend), to call those tools on your system. OpenClaw is an example of such a harness (and not a great or particularly safe one in my opinion, but if you want to be a lunatic and give an AI model free rein it seems to be the best choice). You can use commercial harnesses too, by configuring or tricking them into connecting to a local model instead of their commercial one, although I don’t recommend this for a variety of reasons. If you really want to use Claude Code itself, people have done it, but I don’t find it works very well since all its prompts and tool calling are optimized for Claude models. Besides OpenClaw, other popular harnesses for local models include OpenCode (as close as you’re going to get to Claude for local models) and Cursor, and even Ollama has its own CLI harness now. Personally I use OpenCode a lot, but I am starting to lean towards pi-mono (it’s just called pi, but that’s ungoogleable). It is very minimal and modular, making it intentionally easy to customize with plugins and skills you can automatically install to make it exactly as safe or capable or visual as you wish it to be.
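
    To make the “just text generators” point concrete, here is a toy sketch of what a harness does with the model’s output. The JSON shape (`tool` and `arguments` keys) and the two tools are invented for this example; real models and harnesses each have their own formats, and this is not how any particular one works.

```python
import json


# Hypothetical tools the harness is willing to run on the model's behalf.
def word_count(text: str) -> int:
    return len(text.split())


def add(a: float, b: float) -> float:
    return a + b


TOOLS = {"word_count": word_count, "add": add}


def handle_model_output(output: str) -> str:
    """Run a tool if the model's text is a tool call; otherwise pass it through."""
    try:
        call = json.loads(output)
    except json.JSONDecodeError:
        return output  # ordinary prose, show it to the user as-is
    if not isinstance(call, dict) or "tool" not in call:
        return output  # valid JSON, but not a tool call in our made-up format
    tool = TOOLS.get(call["tool"])
    if tool is None:
        return f"error: unknown tool {call['tool']!r}"
    # The harness, not the model, is what actually executes code.
    result = tool(**call.get("arguments", {}))
    # In a real loop this result would be fed back to the model as more text.
    return json.dumps({"tool": call["tool"], "result": result})
```

    The safety question is entirely about what you put in that `TOOLS` table: the model can only ask, and the harness decides what it is allowed to run.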

    As a minor diversion, we should also discuss what a “tool” is in this context. There are some common basic tools that most tool-use models will understand some variation of, out of the box. Things like editing files, running command-line tools, opening documents, and searching the web are common built-in skills that pretty much any model advertising itself as capable of “tool use” or “tool calling” will support, although some agents will be able to use these skills more capably and effectively than others. Just like some people know the Linux command line fluently and can completely operate their system with it, while others only know basic commands like ls or cat and need a GUI or guidance for anything more complex, AI models vary too: some (the latest models in particular) are incredibly capable with even just their basic built-in tools. However, they’re not limited to what’s built in because, like I said, they can accept guidance on what to use and how to use it. You can guide them explicitly if you happen to be fluent in their tools, but there are kind of two competing approaches for giving them that guidance automatically. The first is MCP (Model Context Protocol), where a separate server they can access provides structured listings of different kinds of tools and how they work, basically allowing them to connect to a huge variety of APIs in almost any software or service. Some harnesses have MCP built in. The other approach is called “skills”, and seems (to me) a more sensible and flexible way of giving the AI model enough understanding to become more capable and expand the tools it can use. Again, providing skills is usually something handled by the harness you’re using.

    To make this a little less abstract, you can put it in the perspective of Claude: Anthropic provides several different Claude models like Haiku, Sonnet, and Opus. These are the text-generation models, and they have been trained to produce a particular tool usage format, but Opus tends to have more built-in capability than something like Haiku, for example. Regardless of which model you choose, though (and you can switch at any time), you’ll be using a harness, typically “Claude Code”, which is the CLI tool most people use to interact with Claude in an agentic, tool-calling capacity.

    On the open and local side of the landscape, we unfortunately don’t have anything quite as fast or capable as Claude Code, but we can do surprisingly okay considering we’re running small local models on consumer hardware, not massive data center farms being enticingly given away or rented for pennies on the dollar of what they actually cost these companies, in the hope that marketshare capture and vendor lock-in will lead to future profits.

    Here are some pretty capable tool-use models I would recommend (most should be available for download through Ollama and other sources like Hugging Face):

    • gemma4 (the latest and greatest hotness, MIT licensed using TurboQuant to deliver pretty incredible capability, performance and results even with limited VRAM)
    • qwen3.5 (from Alibaba, a consistent and traditional leader in open models so far with good capability and modest performance)
    • qwen3-coder-next (a pretty huge coding-focused model you might struggle to run unless you have a very beefy system and GPU)
    • glm4.7-flash (a modestly capable and reasonably fast option)
    • devstral-small-2 (an older, not-so-small variant of Mistral, the French open-weight AI model, if you’re looking for a non-Chinese, non-US based model, which are few and far between)


  • You can do all those things with proper routing and there is no difference from mobile devices (as long as they use DHCP and what mobile device wouldn’t?). What I’m suggesting does not change anything on the public side. You still authenticate publicly to renew your certificates. You still give the same certificates to both public and local networks. They’re still valid. Nothing changes.

    The only difference is that when you’re local, your DNS gives you the correct local IP address where that service is hosted (say, 192.168.12.34), instead of you using public DNS, getting an external IP that’s on the wrong side of the router, and having to go outside your own network and come back in. Hairpin is like that Simpsons episode where Abe goes in the revolving door, takes off his hat, puts his hat back on, and goes back out the same revolving door in the span of 2 seconds. It’s pointless. Why are you doing that? If you didn’t want to be on the outside of the network, why are you going to the outside of the network first? Just stay inside the network. Get the right IP. No hairpin routing needed. No certificate madness needed. Everything just works the way it’s supposed to (because this is in fact the way it’s supposed to work).
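
    Here is the idea boiled down to a toy lookup, in case it helps. The hostname, the /24 LAN range, and the public address (taken from a documentation-only range) are all made-up examples; only 192.168.12.34 comes from what I said above. A real setup would do this with local DNS overrides on your router or DNS server, not with Python.

```python
import ipaddress

# ASSUMPTIONS: example values only, not anyone's real network.
LAN = ipaddress.ip_network("192.168.12.0/24")
LOCAL_RECORDS = {"service.example.com": "192.168.12.34"}
PUBLIC_RECORDS = {"service.example.com": "203.0.113.10"}  # documentation range


def resolve(hostname: str, client_ip: str) -> str:
    """Answer with the LAN address for LAN clients, the public one otherwise."""
    if ipaddress.ip_address(client_ip) in LAN:
        local = LOCAL_RECORDS.get(hostname)
        if local is not None:
            return local  # stay inside the network, no hairpin needed
    return PUBLIC_RECORDS[hostname]


if __name__ == "__main__":
    print(resolve("service.example.com", "192.168.12.50"))  # client at home
    print(resolve("service.example.com", "198.51.100.7"))   # client elsewhere
```

    Same hostname and same certificate either way; only the address you get back changes depending on which side of the router you’re on.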