Why I Don’t Block AI Scrapers
Published on August 29, 2024, filed under Development and Everything Else.
The basic contract of the Web seems to have been called off, with AI scrapers taking content from everywhere to train their models, regardless of content licenses and preferences, without attribution or compensation.
For an increasing number of site and content owners, this has meant blocking AI scrapers. (There are increasingly good tools for that purpose, like Dark Visitors.)
I, for my part, running sites like meiert.com, Frontend Dogma, and WebGlossary.info, first tried excluding and blocking AI scrapers but ultimately stopped.
Scrapers started off ignoring robots.txt directives, and we keep seeing existing and new scrapers that ignore robots.txt preferences. So it’s not only that the approach isn’t working well; AI companies have probably changed the game for good.
Personally, I’m not going to engage in an arms race of trying to block more and more scrapers. I’d rather watch this unfold legally.
Just like on your websites, the content on my websites is under specific licenses. While usually generous, some require attribution, and others specifically cover derivative use. Still, even where there’s no license specified, it’s not anyone else’s content.
So what I’m betting on instead is more legal action, by other businesses and other corporate interests, against what looks like theft.
Will this take a long time to have an effect? Very likely so.
Could this mean the respective work will never be attributed, and its owner (here, me) never compensated for it? That seems likely, too.
Will one even be able to join any cases, to invoke one’s rights? Given how we think about law in Europe (with few class actions), probably not even that.
Still, let’s face it: If anyone walks around and copies content to reuse and resell it, that’s theft, regardless of whether you put up a sign saying “no stealing, please.” And as there hasn’t even been an unwritten “contract” with any AI company, AI scraping of the Web appears to be nothing but theft.
That’s why I don’t block AI scrapers—and let thieves do thief things until our justice system(s) do justice system things.
(And yet, I may be wrong all over the place. I’ll be following the development just as you do, and perhaps make further adjustments depending on how it goes.)
About Me
I’m Jens (long: Jens Oliver Meiert), and I’m a frontend engineering leader and tech author/publisher. I’ve worked as a technical lead for companies like Google and as an engineering manager for companies like Miro, I’m somewhat close to W3C and WHATWG, and I write and review books for O’Reilly and Frontend Dogma.
I love trying things, not only in web development (and engineering management), but also in other areas like philosophy. Here on meiert.com I share some of my views and experiences.
If you’d like to do me a favor, interpret charitably (I speak three languages, and they do collide), yet be critical and give feedback, so that I can make improvements. Thank you!