Tumblr and WordPress posts will reportedly be used for OpenAI and Midjourney training

Parent company Automattic allegedly included private data in its first batch.

Contributing Reporter

Updated Tue, Feb 27, 2024, 3:56 PM·4 min read

Tumblr and WordPress are reportedly set to strike deals to sell user data to artificial intelligence companies OpenAI and Midjourney. 404 Media reports that the platforms’ parent company, Automattic, is nearing completion of an agreement to provide data to help train the AI companies’ models.

It isn’t clear which data will be included, but the report suggests Automattic may have overreached initially. An alleged internal post from Tumblr product manager Cyle Gage suggests Automattic prepared to send private or partner-related data that wasn’t supposed to be included in the deal. The questionable content reportedly included private posts on public blogs, posts from deleted or suspended blogs, unanswered (and therefore not publicly posted) questions, private answers, posts marked explicit and content from premium partner blogs (like Apple’s former music site).

The internal post suggests Automattic’s engineers are preparing a list of post IDs that should have been excluded. It isn’t clear whether the data had already been sent to the AI companies.

Engadget emailed Automattic to ask for comment on the report. The company replied with a published statement, claiming, “We will share only public content that’s hosted on WordPress.com and Tumblr from sites that haven’t opted out.” The statement notes that legal regulations don’t currently require AI companies’ web crawlers to abide by users’ opt-out preferences.

The final line of Automattic’s statement appears to align with the reported deals. “We are also working directly with select AI companies as long as their plans align with what our community cares about: attribution, opt-outs, and control,” Automattic wrote. “Our partnerships will respect all opt-out settings. We also plan to take that a step further and regularly update any partners about people who newly opt out and ask that their content be removed from past sources and future training.”

OpenAI CEO Sam Altman speaking at A Year in TIME at The Plaza Hotel in New York City on December 12, 2023. (Mike Coppola via Getty Images)

The company reportedly plans to launch a new opt-out tool on Wednesday that claims to allow users to block third parties — including AI companies — from training on their data. 404 Media reviewed an alleged internal FAQ Automattic prepared for the tool, which includes the answer, “If you opt out from the start, we will block crawlers from accessing your content by adding your site on a disallowed list. If you change your mind later, we also plan to update any partners about people who newly opt-out and ask that their content be removed from past sources and future training.”

The phrasing may be relevant: Automattic describes itself as “asking” the AI companies to remove the data.

An alleged internal document from Automattic’s AI head, Andrew Spittle, replying to a staff question about data-removal assurances when using the tool, explains, “We will notify existing partners on a regular basis about anyone who’s opted out since the last time we provided a list. I want this to be an ongoing process where we regularly advocate for past content to be excluded based on current preferences. We will ask that content be deleted and removed from any future training runs. I believe partners will honor this based on our conversations with them to this point. I don’t think they gain much overall by retaining it.”

So, if a Tumblr or WordPress user requests to opt out of AI training, Automattic will allegedly “ask” and “advocate for” their removal. And the company’s AI boss “believes” the AI companies will find it in their best interest to comply “based on our conversations.” (How’s that for reassurance!)

AI data training deals have become a lucrative opportunity for websites treading water in today’s slippery online publishing landscape. (Tumblr’s staff was reportedly reduced to a skeleton crew in late 2023.) Last week, Google struck a deal with Reddit (ahead of the latter’s IPO) to train on the platform’s vast knowledge base of user-created content. Meanwhile, OpenAI rolled out a partnership program last year to collect datasets from third parties to help train its AI models.

Update, February 27, 2024, 3:56 PM ET: This story has been updated to add a published statement from WordPress and Tumblr parent company Automattic.

