Social Media and Image Hosting Companies Licensing User Content to Train AI

On April 6, 2022, Elvis Dunderhoff realized that AI can produce different styles of art. He must have watched a YouTube video. -Editor

More stupid nonsense, coming at you at digital speeds.


At its peak in the early 2000s, Photobucket was the world’s top image-hosting site. The media backbone for once-hot services like Myspace and Friendster, it boasted 70 million users and accounted for nearly half of the U.S. online photo market.

Today only 2 million people still use Photobucket, according to analytics tracker Similarweb. But the generative AI revolution may give it a new lease of life.

CEO Ted Leonard, who runs the 40-strong company out of Edwards, Colorado, told Reuters he is in talks with multiple tech companies to license Photobucket’s 13 billion photos and videos to be used to train generative AI models that can produce new content in response to text prompts.

He has discussed rates of between 5 cents and $1 per photo and more than $1 per video, he said, with prices varying widely both by the buyer and the types of imagery sought.

Why would anyone pay that when they can just pull the images off of Google Images for free?

What am I even reading here?

Is this just nonsense news?

“We’ve spoken to companies that have said, ‘we need way more,’” Leonard added, with one buyer telling him they wanted over a billion videos, more than his platform has.

“You scratch your head and say, where do you get that?”

Photobucket declined to identify its prospective buyers, citing commercial confidentiality. The ongoing negotiations, which haven’t been previously reported, suggest the company could be sitting on billions of dollars’ worth of content and give a glimpse into a bustling data market that’s arising in the rush to dominate generative AI technology.

Tech giants like Google, Meta and Microsoft-backed OpenAI initially used reams of data scraped from the internet for free to train generative AI models like ChatGPT that can mimic human creativity. They have said that doing so is both legal and ethical, though they face lawsuits from a string of copyright holders over the practice.

Yeah, but the lawsuits are retarded. Even if they lose, they can just buy the data from the Chinese, who aren’t going to respect copyright.

Or they can pay billions to Photobucket (for content that Photobucket only “owns” because of dubious copyright practices).

At the same time, these tech companies are also quietly paying for content locked behind paywalls and login screens, giving rise to a hidden trade in everything from chat logs to long forgotten personal photos from faded social media apps.

“There is a rush right now to go for copyright holders that have private collections of stuff that is not available to be scraped,” said Edward Klaris from law firm Klaris Law, which says it’s advising content owners on deals worth tens of millions of dollars apiece to license archives of photos, movies and books for AI training.

Reuters spoke to more than 30 people with knowledge of AI data deals, including current and former executives at companies involved, lawyers and consultants, to provide the first in-depth exploration of this fledgling market – detailing the types of content being bought, the prices materializing, plus emerging concerns about the risk of personal data making its way into AI models without people’s knowledge or explicit consent.

OpenAI, Google, Meta, Microsoft, Apple and Amazon all declined to comment on specific data deals and discussions for this article, although Microsoft and Google referred Reuters to supplier codes of conduct that include data-privacy provisions.

Google added that it would “take immediate action, up to and including termination” of its agreement with a supplier if it discovered a violation.

Well, Google owns YouTube, which is by far the biggest image database in the world.

They own the “copyright” on all those videos you uploaded there. Because copyright law is retarded.

Many major market research firms say they have not even begun to estimate the size of the opaque AI data market, where companies often don’t disclose agreements. Those researchers who do, such as Business Research Insights, put the market at roughly $2.5 billion now and forecast it could grow close to $30 billion within a decade.

Photobucket CEO Leonard says he is on solid legal ground, citing an update to the company’s terms of service in October that grants it the “unrestricted right” to sell any uploaded content for the purpose of training AI systems. He sees licensing data as an alternative to selling ads.

Oh okay, Jew.

“We need to pay our bills, and this could give us the ability to continue to support free accounts,” he said.

Photobucket is not alone among platforms in embracing licensing. Tumblr’s parent company Automattic said last month it was sharing content with “select AI companies.” In February, Reuters reported Reddit struck a deal with Google to make its content available for training the latter’s AI models.

Ahead of its initial public offering in March, Reddit disclosed that its data-licensing business is the subject of a U.S. Federal Trade Commission inquiry and acknowledged it could fall foul of evolving privacy and intellectual-property regulations.

The FTC, which warned businesses in February against retroactively changing terms of service for AI usage, declined to comment on the Reddit inquiry or say whether it was looking into other training data deals.

Intellectual property is a stupid, evil law that does nothing but stifle creativity.

The fact that Photobucket owns all this content that random people uploaded just shows that these laws do not actually protect anyone; they are just a racket for corporations to sell other people’s work.

Copyright law, like the age of consent, needs to be tossed into the dustbin of history.