• just_another_person@lemmy.world
      6 days ago

      I’m aware, but why would you self-host an S3-compatible storage implementation when you can host what this tool does via NFS or NBD? Makes no sense.

      • moonpiedumplings@programming.dev
        6 days ago

        The simplest answer is that you picked S3 as the sole storage backend for whatever services/apps you have deployed, and now you want to integrate something that does not support S3 into your stack. So you’ll look at stuff like this.

        Now, why would you self-host S3 in the first place?

        The big reason is security. NFS is a security nightmare: it believes connections when they claim to be root, it has no encryption, and it has basically no authentication by default. [1]

        NBD is a little better, but NBD and NFS share the problem that their services run in the kernel, which exposes a frightening amount of attack surface in privileged parts of Linux systems. Also, NBD is block storage, which far fewer services support as a storage backend.

        S3, on the other hand, is actually designed to be exposed over the internet. It is HTTP-based, so you get encryption from HTTPS, and authentication is built in via requests signed with an access key and secret key. So if you ever end up running something distributed over the internet, NFS is basically not an option, and S3 is a much better solution.

        [1] Was posted to lemmy: https://programming.dev/post/34520407
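To make the authentication point concrete, here is a rough sketch, not real NFS or S3 server code. `trusting_server` is a toy stand-in for a server that believes whatever UID the client sends. `signing_key` follows the key-derivation chain from AWS Signature Version 4, which S3-compatible APIs use, but the string being signed is a simplified placeholder (the real one is built from a canonical request), and the secret, date, and region values are made up for illustration.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    """HMAC-SHA256 of msg under key, returned as raw bytes."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret: str, date: str, region: str, service: str = "s3") -> bytes:
    """Derive a SigV4-style signing key: secret -> date -> region -> service."""
    k_date = _hmac(("AWS4" + secret).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign(secret: str, date: str, region: str, string_to_sign: str) -> str:
    """Hex signature over string_to_sign (simplified placeholder input)."""
    return _hmac(signing_key(secret, date, region), string_to_sign).hex()


def trusting_server(claimed_uid: int) -> bool:
    """Toy model of unauthenticated trust: the claimed UID is simply believed."""
    return claimed_uid == 0  # "I am root" is accepted as-is


def signing_server(stored_secret: str, date: str, region: str,
                   string_to_sign: str, signature: str) -> bool:
    """Toy model of signed requests: recompute the signature and compare."""
    expected = sign(stored_secret, date, region, string_to_sign)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    # All values below are hypothetical.
    date, region, request = "20240101", "garage", "GET /bucket/key"
    good = sign("real-secret", date, region, request)
    forged = sign("guessed-secret", date, region, request)
    print(trusting_server(0))
    print(signing_server("real-secret", date, region, request, good))
    print(signing_server("real-secret", date, region, request, forged))
```

The point of the contrast: in the first model anyone who can reach the port can claim to be root, while in the second a client has to prove it knows the secret without ever sending it over the wire.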

        • just_another_person@lemmy.world
          6 days ago

          That’s just…a bad idea then. You have every opportunity to not do that, and you’re using a solution like this as a patch to solve for other issues you had. Not a good use-case.

          • moonpiedumplings@programming.dev
            6 days ago

            This is one of the most popular ways to run apps that talk to a storage backend directly over the internet, which gets you more bandwidth and separates deployment from state.

            If I’m hosting something like Nextcloud at a massive scale, it simply isn’t feasible to use internal networks because they don’t have enough bandwidth, and overlay/VPN network solutions have too much overhead. A common solution is to just run the services and connect to them directly over the internet. So I point my 10,000-node Nextcloud instance at S3, either my own cluster or somebody else’s, and S3 handles encryption over HTTPS while remaining reasonably performant and scalable.

            Garage uses their software internally for something similar: https://garagehq.deuxfleurs.fr/documentation/design/goals/. On this page, they describe using it as their Matrix image/file storage cache. Same thing there: they probably have a large distributed Matrix cluster that needs storage that scales with it, while also being secure across different networks.

            • just_another_person@lemmy.world
              6 days ago

              Okay, as I said before, you had a technical issue you couldn’t fix for one specific use-case. Jumping down to a less efficient abstraction (which this is) to solve that problem isn’t a good solution, ESPECIALLY if you’re self-hosting as you describe.

              If I go to a store to buy hot dogs, and they’re out of hot dogs, I wouldn’t buy hamburgers, cut them in half, and try to pass them off as hot dogs just because they fit in a hot dog bun.