Reddit Blocks Internet Archive, Limiting Access to Its Vast Trove of Data

In its fight against AI data scraping, Reddit is now blocking the Internet Archive's Wayback Machine from accessing most of its content.

Reddit is taking its fight against AI companies to a new level, and this time, a beloved internet institution is caught in the crossfire. The company has placed new restrictions on the Internet Archive’s Wayback Machine, a move that will severely limit its ability to preserve Reddit’s history.

The Wayback Machine, the Internet Archive’s nonprofit web-archiving service, will now only be able to crawl Reddit’s homepage. It will no longer be able to access and save individual posts, comments, subreddit pages, or user profiles. This means a huge part of Reddit’s vast and often chaotic history could be lost forever.

The move is part of Reddit’s broader war on AI companies that it says are using its data to train their models without paying for it. Reddit has already signed multimillion-dollar deals with Google and OpenAI, and it’s taking a hard line against anyone else trying to scrape its content for free. Reddit appears to believe that AI companies are circumventing its rules by pulling data from the Wayback Machine’s archives.

This is a major reversal for Reddit. Just last year, the company said it would not block “good faith actors” like the Internet Archive. It’s not clear what has changed since then, but the decision is a significant setback for researchers, historians, and anyone who values the preservation of online culture.
