Leaked Emails Expose Meta’s Massive AI Training Piracy Scandal
Tech giant Meta was accused last month of training its AI systems illegally. The lawsuit alleges that Facebook’s parent firm used
pirated content such as articles and ebooks to do so.
Now, newly unsealed emails provide the latest evidence of how Meta
went about it. The case was brought by book authors, and the
communications that have come to light mark a new breakthrough in
their lawsuit.
The emails
show that Meta admitted to the controversial practice and that it
torrented a major dataset known as LibGen, which contains tens of
millions of pirated works. According to the authors’ filing, Meta used
81.7 TB of data from several shadow libraries, obtained through Anna’s
Archive. That figure includes 35.7 TB of data from Z-Library and
LibGen. The filing adds that Meta had previously torrented 80.6 TB of
data from LibGen.
The emails show that employees at Meta were well aware of the legal risks attached to these actions. In 2023, a leading research engineer at the firm remarked that torrenting from a company laptop did not feel right.
In the internal message, engineer Nikolay Bashlykov raised serious concerns about using Meta IP addresses and corporate laptops to download pirated material. In September 2023, he went a step further, pressing his objections and consulting legal experts on the matter.
Torrenting
involves seeding files, which means sharing the data with the outside
world, and he noted that this was not legally acceptable. The authors
argue that, despite these warnings, the tech giant knew what it was
getting into but still chose to carry on with the illegal activity.
Meta also tried to hide the activity
by changing settings so that only the smallest possible amount of
seeding could take place. Additionally, it tried to reduce the risk of
anyone tracing the downloader or seeder involved by loading the data
onto servers that Meta did not own.