Killer Charts

Where does AI get its info from?

Five charts to start your day

James Eagle
Aug 19, 2025

Good morning – here are your five charts for the day.

Reddit has always been one of my favourite social media platforms. What I like about it is that it is both unfiltered and clever in the way it rewards content.

If people like what you have posted, you get an upvote. If they don't, you get downvoted. It's very transparent in this way, so you don't end up seduced by the echo chamber effect as you do on LinkedIn, where your followers are the only ones who comment on and like your posts.

But there is a darker side to Reddit, and it is nothing new. For years, Reddit has been a place to mine for data with simple Python code. Nowadays, AI developers are harvesting Reddit for as much training data as they can collect. Understandably, this has some people worried.

Reddit rules the AI training universe

Reddit has become the unexpected kingmaker of artificial intelligence, with 40.1% of all AI model citations now coming from the platform's chaotic threads and comment sections. This dominance dwarfs Wikipedia's 26.3% share and leaves YouTube trailing at 23.5%, fundamentally reshaping how AI systems learn about our world.

The implications are staggering. Reddit's $60 million annual data licensing deal with Google represents just the beginning of a gold rush, as every major AI company scrambles for authentic human conversation data. The platform's 73 million daily active users are unknowingly training the next generation of ChatGPT, Claude and Gemini models with their arguments about everything from stock tips to sandwich recipes.

This creates a fascinating paradox. While Reddit's unfiltered discussions provide AI with genuine human perspectives, they also inject bias, misinformation and the occasional conspiracy theory directly into machine learning systems. Companies are desperately trying to filter signal from noise, but with Reddit commanding such market share, the platform's collective wisdom – and stupidity – increasingly defines how AI understands reality.

Source: Statista

Want the other four? Become a paid subscriber.


© 2025 James Eagle