Reddit has a warning for AI companies and other scrapers: play by its rules or get blocked. In an update, the company said it plans to revise its Robots Exclusion Protocol (robots.txt file), which tells automated crawlers whether they're allowed to scrape its platform.
Reddit said it will also continue to block and rate-limit crawlers and other bots that don't have a prior agreement with it. The changes, it said, shouldn't affect "good faith actors" like the Internet Archive and researchers.
Reddit’s notice comes shortly after multiple reports that Perplexity and other AI companies regularly bypass websites’ robots.txt protocol, which is used by publishers to tell web crawlers they don’t want their content accessed. Perplexity’s CEO, in a recent interview with Fast Company, said that the protocol is “not a legal framework.”
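For context, the protocol is purely advisory: a site publishes a robots.txt file listing which user agents may fetch which paths, and a compliant crawler checks those rules before requesting pages. Here is a minimal sketch of that check using Python's standard library; the user agent and URLs are placeholders, not Reddit's actual rules.

```python
import urllib.robotparser

# Download and parse the site's robots.txt.
parser = urllib.robotparser.RobotFileParser()
parser.set_url("https://www.reddit.com/robots.txt")
parser.read()

# Only fetch a page if the site's rules allow this user agent to do so.
if parser.can_fetch("ExampleBot", "https://www.reddit.com/r/programming/"):
    print("Allowed to crawl this page")
else:
    print("Disallowed by robots.txt")
```

Nothing in the file itself enforces those rules, which is why Reddit pairs the update with active blocking and rate-limiting of non-compliant bots.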
In a statement, a Reddit spokesperson told Engadget that it wasn’t targeting a particular company. “This update isn