Investigation: Apple, Nvidia, Anthropic, and others trained their AI on a dataset containing YouTube video transcripts, including from the WSJ, MrBeast, and MIT
Creators claim their videos were used without their knowledge — AI companies are generally secretive about their sources of training data …
Proof
Related Coverage
- Creators angry after tech giants are found using YouTube content to train AI models Cybernews.com · Konstancija Gasaitytė
- If you think AI labs wouldn't stoop to using scraped YouTube subtitles for training, think again The Register · Matthew Connatser
- The Morning After: AI models from Apple, NVIDIA and more were reportedly trained on YouTube videos Engadget · Mat Smith
- Big tech firms have reportedly used thousands of YouTube videos to train AI Computerworld · Viktor Eriksson
- Apple and AI companies accused of scanning YouTube data without permission — here's what we know Tom's Guide · Ryan Morrison
- Investigation Finds Apple, NVIDIA, Anthropic Used YouTube Transcripts for AI Training WinBuzzer · Luke Jones
- Big tech firms secretly used YouTube videos to train AI. Ben's Bites
- YouTube creators surprised to find Apple and others trained AI on their videos Ars Technica · Samuel Axon
- Apple trained AI models on YouTube content without consent; includes MKBHD videos 9to5Mac · Ben Lovejoy
- Stop Indexing! And Pay Up! Beyond Search · Stephen E. Arnold
- Apple and Anthropic Use YouTube Videos to Train AI Models GreekReporter.com · Abdul Moeed
- Apple, Nvidia, Anthropic Accused Of Using YouTube Videos To Train AI Models Without Creators' Consent: ‘This Is Going to Be An Evolving Problem For A Long Time,’ Says MKBHD Benzinga
- Is your YouTube content powering AI? Apple, NVIDIA's practices exposed NewsBytes · Mudit Dube
- AI giants stolen training data revealed The Rundown AI · Rowan Cheung
- Looks like no one really cares about YouTube's rules on AI training TechIssuesToday.com · Dwayne Cubbins
- Apple, Anthropic, other tech companies under scanner for using YouTube videos to train AI, report says Financial Express
- Apple reportedly used videos without permission from late night hosts and others to train AI PhoneArena · Alan Friedman
- AI companies reportedly used YouTube video transcripts for training Android Headlines · Alap Naik Desai
- Report claims that Anthropic, Nvidia, Apple and Salesforce used YouTube transcripts to train AI SiliconANGLE · Duncan Riley
- Tech giants use YouTube subtitles for AI training without permission Cryptopolitan · Brenda Kanana
- Investigation finds companies are training AI models with YouTube content without permission TechRadar · Eric Hal Schwartz
- Apple, Nvidia, and others trained their AI on YouTube content without user consent or knowledge TechSpot · Cal Jeffrey
- Apple and Salesforce AI training datasets co-opt MrBeast, Marques Brownlee videos Mashable · Elizabeth de Luna
- AI training dataset used by tech giants allegedly created by scraping YouTube videos in violation of terms Bitcoin Insider · Mike Dalton
- Apple was just caught training AI on YouTube videos without consent Digital Trends · Andrew Tarantola
- Apple, Nvidia, and other tech companies trained AI with thousands of YouTube videos Quartz · Britney Nguyen
- Apple, Nvidia, Anthropic in hot waters over AI trained using controversial YouTube sources Neowin · Aditya Tiwari
- Nvidia, Apple AI Scraped Dataset With 173K YouTube Videos, Taylor Swift Lyrics PCMag · Kate Irwin
- Apple, Anthropic, and Nvidia caught using YouTube subtitles for AI training iThinkDifferent · Asma Hussain
- Tech giants allegedly used thousands of YouTube videos for AI training without creators' consent The Decoder · Matthias Bastian
- Apple, Nvidia, Anthropic Using Unauthorized YouTube Videos for AI Training iPhone in Canada Blog · Usman Qureshi
- The Prompt: The Mysterious Rise Of Trump's AI Whisperer Forbes · Rashi Shrivastava
- Tech Firms Including Apple Caught Using YouTube Data to Train AI Models PetaPixel · Matt Growcoot
- Apple, NVIDIA, Salesforce & more accused of scraping YouTube videos to train AI models Shacknews · TJ Denzer
- AI companies used YouTube videos without permission to train models Stack Diary · Alex Ivanovs
- Tech companies used thousands of swiped YouTube videos to train AI Nieman Lab
- YouTube creators are up in arms because it turns out Anthropic, Apple & Nvidia have trained their AIs on subtitles from their videos. — This fundamentally misunderstands how the web works. Crawling the web to build apps is what it's built on from day one. — https://www.proofnews.org/... @carnage4life@mas.to · Dare Obasanjo
- Wondering if your favorite YouTubers' videos were used to train AI? Now you can find out. The latest investigation from @proof__news reveals … Craig Newmark
Discussion
-
@mkbhd
Marques Brownlee
on x
Apple has sourced data for their AI from several companies One of them scraped tons of data/transcripts from YouTube videos, including mine Apple technically avoids “fault” here because they're not the ones scraping But this is going to be an evolving problem for a long time
-
@mkbhd
Marques Brownlee
on x
Fun fact, I pay a service (by the minute) for more accurate transcriptions of my own videos, which I then upload to YouTube's back-end. So companies that scrape transcripts are stealing *paid* work in more than one way. Not great.
-
@mysk_co
@mysk_co
on x
It's not that tough to win a privacy competition against @googlechrome. This is how Safari's Privacy Nutrition Label compares to another browser such as @brave: [image]
-
@sarafischer
Sara Fischer
on x
NEW: @apple has brought on @taboola to sell ads on its behalf across Apple News and Stocks apps - Taboola will be the exclusive ad seller globally where the apps are available. - NBCU will continue to sell ads in some markets https://www.axios.com/...
-
@leonieclaude
Roberta Fischli
on x
“Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce.” @WIRED @proof__news https://www.wired.com/...
-
@dwiskus
Dave Wiskus
on x
Proof News has a great video exposé — yay real video journalism — on creator work being stolen from YouTube to power machines designed to (poorly) replace us https://www.youtube.com/...
-
@neilturkewitz
Neil Turkewitz
on x
“It's ‘disrespectful’ to use creators' work without their consent, especially since studios may use ‘GenAI to replace as many of the artists along the way as they can...Will this be used to exploit & harm artists? Yes, absolutely.’” —@dwiskus https://www.wired.com/...
-
@proof__news
@proof__news
on x
Our latest investigation reveals a dataset of more than 170,000 YouTube video subtitles that big tech companies used to train their AI models. “Will this be used to exploit and harm artists? Yes, absolutely,” says @dwiskus. https://www.proofnews.org/...
-
@neilturkewitz
Neil Turkewitz
on x
This is hysterical. “Among the videos used by AI companies are 146 from Einstein Parrot.” It's a channel featuring an actual parrot. Life imitates stochastic parrots! We have come full circle. @timnitGebru @emilymbender @mmitchell_ai
-
@_felixsimon_
Felix M. Simon
on x
Political economy of AI and news, latest: ... “Proof News found some of the wealthiest AI companies [...] have used material from thousands of YouTube videos to train AI. Companies did so despite YouTube's rules against harvesting materials from the platform without permission.”
-
@sophiasgaler
Sophia Smith Galer
on x
This is extremely depressing.
-
@fb_bmb
Mathew Buck
on x
Just discovered at least one of my videos has been used to train AI. Needless to say, I did not consent to this and was not aware of this until now. [image]
-
@juliaangwin
Julia Angwin
on x
Huge investigation from @proof__news today: We reveal the trove of YouTube videos that are being used to train AI models (including Anthropic's Claude). Yes, it includes all your favorite YouTubers - from @hankgreen to @MrBeast to @khanacademy. https://www.proofnews.org/...
-
@_felixsimon_
Felix M. Simon
on x
Using @Proof__news tool to search the data set, it's easy to find some well-known publishers swept up in this, too, with both @FT & @guardian videos (including @johnharris1969) part of the dataset. [image]
-
r/youtubedrama
on reddit
Several YouTubers Had Their Videos Scraped to Train AI Tools for Apple, Nvidia, and Others
-
r/technology
on reddit
Apple, Nvidia, Anthropic Used Thousands (173,536) of Swiped YouTube Videos to Train AI
-
r/technews
on reddit
Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI