Bluesky may have said it won’t use user data to train generative AI, but someone else just published a dataset of one million Bluesky posts for “machine learning research”. The dataset is already very popular, and your data may have been scraped.
The same can and will happen with the Fediverse, right?
Probably already happened
deleted by creator
I see. Probably mastodon.social gets scraped, then 🫣
Is that a problem for a proper scraper? Give the machine a list of domains and some hints about the relevant protocols, and then the computer runs until the end of the list.
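To make that concrete, here is a rough sketch of the loop being described, not anyone's actual scraper. The names `PROTOCOL_HINTS`, `build_url`, and `crawl` are made up for illustration, and the fetch step is passed in as a function so the sketch stays independent of any HTTP library. The only real-world detail assumed is Mastodon's public timeline endpoint, `/api/v1/timelines/public`.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical "hints": one public endpoint template per protocol.
# The Mastodon path is its public timeline endpoint; other entries
# would have to be added by whoever runs this.
PROTOCOL_HINTS: Dict[str, str] = {
    "mastodon": "/api/v1/timelines/public?limit={limit}",
}


def build_url(domain: str, protocol: str, limit: int = 20) -> str:
    """Combine a domain with the endpoint template for its protocol."""
    return f"https://{domain}" + PROTOCOL_HINTS[protocol].format(limit=limit)


def crawl(
    targets: Iterable[Tuple[str, str]],
    fetch: Callable[[str], object],
) -> Dict[str, object]:
    """Walk the (domain, protocol) list until it ends, collecting responses."""
    results: Dict[str, object] = {}
    for domain, protocol in targets:
        if protocol not in PROTOCOL_HINTS:
            continue  # no hint for this protocol, so skip the domain
        try:
            results[domain] = fetch(build_url(domain, protocol))
        except Exception:
            continue  # unreachable or blocking instance: move on
    return results
```

Whatever `fetch` you plug in (an HTTP client call, say) does the real work; the list walking itself is trivial, which is the point. Instances that require authentication for that endpoint would just fall into the skipped cases.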
tbh this can happen with everything now so…
i’m not sure what would be the solution, sadly.