Why I Don’t Block AI Scrapers
Published on Aug 29, 2024 (updated Sep 25, 2024), filed under development, misc.
The basic contract of the Web seems to have been called off, with AI scrapers taking content from everywhere to train their models, regardless of content licenses and preferences, without attribution or compensation.
For an increasing number of site and content owners, this has meant blocking AI scrapers. (There are also increasingly good tools for that purpose, like Dark Visitors.)
I, for my part, run sites like meiert.com, Frontend Dogma, and WebGlossary.info. I first tried excluding and blocking AI scrapers, but ultimately stopped.
Scrapers started out ignoring robots.txt directives, and we keep seeing existing and new scrapers that ignore robots.txt preferences. The approach isn’t just working poorly: AI companies have probably changed the game for good.
Personally, I’m not going to engage in an arms race of trying to block more and more scrapers. I’d rather watch this unfold legally.
Just like on your websites, the content on my websites is under specific licenses. While these are usually generous, some require attribution, and others specifically cover derivative use. Still, even where there’s no license specified, it’s not anyone else’s content.
So what I’m betting on instead is more legal action, by other businesses and other corporate interests, against what looks like theft.
Will this take a long time to have an effect? Very likely so.
Could this mean the respective work will never get attributed, and its owner (here, me) never be compensated for it? That seems likely, too.
Will one even be able to join any cases, to invoke one’s rights? Given how we think about law in Europe (with few if any class actions), probably not even that.
Still, let’s face it: If anyone walks around and copies content, to reuse it and resell it, then that’s theft regardless of whether you had put up a sign saying “no stealing, please.” And as there hasn’t even been an unwritten “contract” with any AI company, AI scraping the Web appears to be nothing but theft.
That’s why I don’t block AI scrapers, and let thieves do thief things until our justice system(s) do justice system things.
(And yet, I may be wrong all over the place. I’ll be following the developments just as you do, and perhaps make further adjustments depending on how it goes.)
About Me
I’m Jens (long: Jens Oliver Meiert), and I’m a web developer, manager, and author. I’ve been working as a technical lead and engineering manager for companies you’ve never heard of and companies you use every day. I’m an occasional contributor to web standards (like HTML, CSS, WCAG), and I write and review books for O’Reilly and Frontend Dogma.
I love trying things, not only in web development and engineering management, but also in other areas like philosophy. Here on meiert.com I share some of my experiences and views. (I value you being critical, interpreting charitably, and giving feedback.)