• ironsoap@lemmy.one
    9 months ago

    Dumb question for the Lemmy lawyers: if enough redditors joined, could a class action lawsuit be filed to be paid for their content… Or is that so outside of the TOS that it’s not worth considering?

      • jarfil@beehaw.org
        9 months ago

        Reddit doesn’t “own” the content, TOS only have users agree to give Reddit a license to do as it pleases.

        • TexMexBazooka@lemm.ee
          9 months ago

          Ah, right they don’t own it! It’s just stored on their servers, and they have exclusive rights to do whatever they’d like with it. But they don’t own it.

      • Echo Dot@feddit.uk
        9 months ago

        However, it gets interesting because under EU law, TOS terms that violate GDPR are not enforceable. So at least EU citizens could probably have some recourse.

  • DeltaTangoLima@reddrefuge.com
    9 months ago

    And that’s why I deleted all my posts and comments before deleting my account. Sure, they could probably go back and restore it if they wanted but, so far, they haven’t.

    Glad I landed here on Lemmy.

    • DaleGribble88@programming.dev
      9 months ago

      Yeah! Here, no one gets paid when someone else wants to profit off of all the free user generated content. Wait, what was our goal again?

    • Skull giver@popplesburger.hilciferous.nl
      9 months ago

      On Lemmy all you need to do is follow every community you can find and you’ll get a stream of posts, comments, voting behaviour, edits, and even admin behaviour, all raw and unprocessed with all the metadata you could hope for without paying a penny.

      I’m not saying every Lemmy server is being used to train AI models, but I’m sure the big ones are.

      • Echo Dot@feddit.uk
        9 months ago

        Presumably most of the current AI models have already had access to reddit data in the past, so I am a bit confused about why they would pay 60 million for it now.

    • Phen@lemmy.eco.br
      9 months ago

      I deleted all my comments last year. Recently I got a notification for a response to one of those comments. When I clicked the notification link, my comment and the response were visible. The comment doesn’t show up in my profile.

      • Hubi@feddit.de
        9 months ago

        I’ve had the same experience. Most scripts just erase the comments available directly through your reddit profile, which is limited to the most recent ~2000 posts that you’ve made. To fully erase anything and everything, you need to request all your data from reddit, download the .zip and feed it into an application like shreddit.
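
For anyone wondering what "feed the .zip into a tool" amounts to, here is a minimal sketch of the first step: pulling every comment ID out of the data export so the list is not capped by what the profile page shows. The member name `comments.csv` and the column name `id` are assumptions about the export layout, not something confirmed in this thread, and the function name is made up for illustration. Check your own archive before relying on it.

```python
import csv
import io
import zipfile
from typing import List


def comment_ids_from_export(zip_source, member="comments.csv", id_column="id") -> List[str]:
    """Return every non-empty value of `id_column` from a CSV inside the export zip.

    `zip_source` may be a filesystem path or a binary file-like object.
    `member` and `id_column` are assumed names; adjust them to match your export.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # archive.open() yields bytes, so wrap it for the csv module.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            return [row[id_column] for row in reader if row.get(id_column)]
```

The resulting list can then be handed to whatever deletion or editing tool you use, instead of the IDs that tool would discover from the profile listing.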

      • DeltaTangoLima@reddrefuge.com
        9 months ago

        Interesting. I’ve specifically searched for some fairly unique content (Python scripts, etc) I posted in my time over there, and it hasn’t shown up at all.

        So you left your Reddit account intact?

        Edit: Fucking. Cunts. I just searched (had been a few months) and at least some of my data is back. I reckon they’ve done it ahead of the planned AI move and IPO.

      • thatsnothowyoudoit@lemmy.ca
        9 months ago

        Reddit was aggressively rate limiting tools used to delete and edit content in a funny way when the API pricing was announced. The API wouldn’t return an error, the rate limiting was silent, and the tools would report successful deletion or edits even when the edit or deletion wasn’t made.

        I had to modify an existing script to handle the 5-second rate limit and, in lieu of deleting, I just rewrote each comment with a farewell.

        Even then I did 3 passes (minor additional edits) in case Reddit was saving previous edits.

        My content has stayed edited.
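
A rough sketch of the approach described above, not the actual script: wait between edits, then read each comment back instead of trusting the "success" response, and retry anything that did not stick. The `edit` and `fetch` callables are hypothetical stand-ins for whatever API client you use, and the farewell text is a placeholder. The 5-second delay and 3 passes come from the comment above. Note that the passes here are retries of dropped edits, which is a slightly different purpose from the extra edits mentioned above.

```python
import time
from typing import Callable, Iterable, List

FAREWELL = "This comment was removed by its author."  # placeholder text


def overwrite_with_verification(
    comment_ids: Iterable[str],
    edit: Callable[[str, str], None],   # hypothetical: edit(comment_id, new_text)
    fetch: Callable[[str], str],        # hypothetical: fetch(comment_id) -> current text
    text: str = FAREWELL,
    delay_seconds: float = 5.0,
    max_passes: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Overwrite each comment and return the IDs that still hold old text."""
    pending = list(comment_ids)
    for _ in range(max_passes):
        if not pending:
            break
        still_pending = []
        for comment_id in pending:
            edit(comment_id, text)
            sleep(delay_seconds)  # stay under the rate limit
            # A silent drop looks like success, so compare what is stored.
            if fetch(comment_id) != text:
                still_pending.append(comment_id)
        pending = still_pending
    return pending
```

Anything left in the returned list is worth checking by hand, since it survived every pass.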

        • dubyakay@lemmy.ca
          9 months ago

          Do you still have the Python script available?

          I was fine with keeping my comments up before for the future searchers, but I’m not fine with that shithole making profit off of it.

    • bevan@lemmy.nz
      9 months ago

      Yep used ‘power delete suite’ to delete everything before I left.

      • sunbeam60@lemmy.one
        9 months ago

        I suspect Reddit holds a perfect copy of every edit you’ve ever made, including the first. For legal reasons if nothing else. Now also to prevent perfectly good AI training content from being deleted.

  • Evil_Shrubbery@lemm.ee
    9 months ago

    Just in time to make new AI generated shitposts with AI generated replies & pump up those numbers for the IPO.

    Can’t wait to read a post about how a novice AI finds it hard to animate human hands and some other AI suggests studying hentai porn to get the finger/tentacle movements just right. And ofc lots of ads. From AIs, to AIs, by AIs, for AIs.

  • comicallycluttered@beehaw.org
    9 months ago

    Lol, so they’re going to be training their AI on… AI generated content? The uptick in that shit on reddit has made it more annoying than usual.

    That and all the confidently incorrect shit on the site… Not to mention the constant in-jokes. I’m just imagining a chatbot responding to something about how to deal with grief with “I also choose this man’s dead wife!”

    Can’t see how this could possibly go wrong.

    • mob@sopuli.xyz
      9 months ago

      60 million a year for access to the relatively public data… That seems pretty good to me tbh.

      • fine_sandy_bottom@discuss.tchncs.de
        9 months ago

        Maybe, but with people saying reddit’s main value proposition is access to AI training data, and that reddit is worth n billion dollars, $60m seems like a pittance.

          • fine_sandy_bottom@discuss.tchncs.de
            9 months ago

            No, it’s really not.

            Firstly, while the data may be public, it’s not “free”. Scraping reddit and using it to train an AI would likely contravene their terms of use, and you’d end up facing copyright issues similar to those the current generation of bots has.

            Secondly, scraped data would be incomplete, you wouldn’t get anything edited or “deleted”, which would surely be available if you paid them. The edits and deletes would be very valuable for AI training.

            Thirdly, by paying you would get the metadata that reddit has: geolocation, user agent, alt accounts, browsing habits, et cetera.

            Fourthly, you wouldn’t get exclusivity. Locking out a competitor is worth something.

            • mob@sopuli.xyz
              9 months ago

              Idk why you are talking about scraping when I said API?

              And is all that information in the training contract?

              • fine_sandy_bottom@discuss.tchncs.de
                9 months ago

                I assumed that when you said “it’s just an API” you were saying you’re paying $60m for an API as opposed to scraping for free.

                Is all what information in the training contract?

    • Evil_Shrubbery@lemm.ee
      9 months ago

      Yeah, the diarrhea of my shitposts over there alone is worth more, it’s what will make the future AI kinda smart & very depressed.

      • My layman’s understanding would be that they include it in the TOS, and your only option would be to leave the platform and demand that they delete all your content, which they may or may not do. E.g. they could just train the AI on an older backup. Good luck getting your rights recognized and abided by.

      • And009@lemmynsfw.com
        9 months ago

        It doesn’t, as soon as you post on reddit it becomes ‘content’ on their social media.

        • Kichae@lemmy.ca
          9 months ago

          No, the user owns it, but by creating an account you provide Reddit a license to use that content in certain ways.

          So, it’s yours, but you’ve agreed to let them do whatever they want with it as if it’s theirs, too.

          • And009@lemmynsfw.com
            9 months ago

            Yes, as we left reddit, the option to delete everything and leave a memorable ‘fuck u/spez’ was always ours.

    • Natanael@slrpnk.net
      9 months ago

      No, it was just preemptive to enforce control over who can programmatically read the site

  • neocamel@lemmy.studio
    9 months ago

    Sounds like it’s time for me to actually log back in and delete all my old posts. I’ve been putting that off for too long.

    • Hubi@feddit.de
      9 months ago

      And the outputs of bots. There has been a shocking increase in auto-generated comments on reddit in the past years and it’s turning the training data into a minefield.

      • nul@programming.dev
        9 months ago

        Haven’t touched reddit socially in 8 months, but every now and then I’ll use it to search for opinions or instructions on things. Searched “reddit best domain registrar” recently and landed on a thread where top to bottom, every comment recommending a registrar was from a bot and/or banned account. No real person testimonials, all ads. And as AI implementations improve, that’s going to get harder to spot. In the meantime, I’m formatting searches like “best domain registrar lemmy” because reddit is legit that bad rn.