How to stop AI from scraping my art

By The Poisoning.ai team

You cannot fully stop a determined scraper from downloading a public image, but you can make the crawl collect less and make whatever it collects useless for training. That splits into three layers: control what crawlers can reach with robots.txt, opt-outs and smaller uploads; degrade the payload with a style cloak like Glaze or a poison like Nightshade; and block downstream editing with PhotoGuard or Anti-DreamBooth. None of these is a wall. Each one raises the cost of turning your art into training data.

Can you actually stop AI from scraping your art?

Not completely. If an image is public and a human can see it, a bot can fetch it, so “stop AI scraping” in practice means two things: reduce what gets collected, and degrade what does. That is a narrower goal than the full training-protection checklist, which is about stopping a model from learning your style once it already holds your work. Here the focus is the collection step itself, the crawl that pulls your images into a dataset in the first place, and the starting point is that you are shaping the odds, not locking a door.

Control what crawlers can collect

Start with the passive layer, because it costs nothing and does not depend on any arms race holding.

Control                  | What it does                                   | Limit
robots.txt disallow      | Blocks compliant AI crawlers on your own site  | Voluntary; bad actors ignore it
Do-Not-Train opt-out     | De-lists you from participating datasets       | Coverage varies; closed models unaffected
Lower-resolution uploads | Less signal to learn from                      | Reduces quality on display too

A robots.txt file on your own site can disallow the user-agent tokens that the major AI crawlers honour, such as GPTBot (OpenAI), Google-Extended (Google's control token for AI training use) and CCBot (Common Crawl). The limit is that robots.txt is voluntary: compliant crawlers respect it and bad actors ignore it, so treat it as a floor rather than a lock. Beyond your own site, many platforms and Do-Not-Train registries now offer an opt-out; the mechanics of registering and what each registry actually covers are walked through in anti-scrape data poisoning and opting out, so this article does not repeat them. Two habits round out the layer: upload at a lower resolution so there is less signal to learn from, and never post your only high-resolution original. Keep a clean copy private as proof of authorship if a dispute ever arises.
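As a rough illustration of the robots.txt layer, the Python sketch below builds a file that disallows the three tokens named above and then checks it with the standard library's robots.txt parser. The helper names (build_robots_txt, is_allowed) and the example.com URL are hypothetical, and the check only models how a compliant crawler would read the file; it says nothing about crawlers that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in this article; the list is illustrative, not exhaustive.
AI_CRAWLERS = ["GPTBot", "Google-Extended", "CCBot"]


def build_robots_txt(agents, disallow="/"):
    """Return robots.txt text that disallows each agent from `disallow`."""
    blocks = [f"User-agent: {agent}\nDisallow: {disallow}\n" for agent in agents]
    # Every other crawler (ordinary search engines, for example) stays allowed.
    blocks.append("User-agent: *\nAllow: /\n")
    return "\n".join(blocks)


def is_allowed(robots_txt, agent, url):
    """Ask the standard-library parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots_txt = build_robots_txt(AI_CRAWLERS)
print(robots_txt)

image_url = "https://example.com/gallery/piece.png"  # hypothetical URL
for agent in AI_CRAWLERS + ["Googlebot"]:
    print(agent, "allowed:", is_allowed(robots_txt, agent, image_url))
```

Saving the printed text as robots.txt at the root of your site is what makes it visible to crawlers. Listing each token in its own group keeps the file easy to extend when a new crawler announces itself.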

Make what gets scraped useless to train on

For anything that will be scraped anyway, the goal shifts to spoiling the payload. A style cloak is the direct option: Glaze (Shan, Cryan, Wenger, Zheng, Hanocka, Zhao, USENIX Security 2023) applies cloaks that, in the authors’ words, “apply barely perceptible perturbations to images, and when used as training data, mislead generative models that try to mimic a specific artist.” Mist (Liang and Wu, 2023) is an open-source cloak in the same family, using a larger perturbation budget of 17/255. To push back at the scraper rather than shield a single image, Nightshade (Shan, Ding, Passananti, Wu, Zheng, Zhao, IEEE S&P 2024) poisons the concept a model learns, and its authors report that “less than 100 poisoned training samples” can control a single SDXL prompt. The full routine for stacking these is in how to protect your art from AI training; the difference between the two tools is in Glaze and Nightshade, explained.
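To make the 17/255 figure concrete, the sketch below assumes the budget is a per-pixel (L-infinity) bound, which is the usual convention for this family of perturbations but is an assumption here rather than something the article states. Under that reading, no pixel channel may move more than 17 of its 255 levels, roughly 6.7% of the full range. The function name project_to_budget and the sample values are hypothetical, and this is only the clamping step, not the optimisation that any of these tools uses to choose a perturbation.

```python
EPSILON = 17  # budget in 8-bit levels, i.e. 17/255 on a 0..1 scale


def project_to_budget(original, perturbed, epsilon=EPSILON):
    """Clamp each perturbed value to within +/- epsilon of the original and to 0..255."""
    out = []
    for o, p in zip(original, perturbed):
        low = max(0, o - epsilon)
        high = min(255, o + epsilon)
        out.append(min(max(p, low), high))
    return out


# Four sample channel values and a proposed perturbation that overshoots in places.
original = [0, 100, 200, 250]
proposed = [40, 90, 160, 255]

cloaked = project_to_budget(original, proposed)
print(cloaked)  # [17, 90, 183, 255]
print(max(abs(c - o) for c, o in zip(cloaked, original)))  # 17
```

A larger budget gives the perturbation more room to mislead a model, but it also makes the change more visible on the displayed image.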

Block editing and face cloning after the scrape

Scraping is not the only way a collected image gets misused. If your worry is someone editing your picture or cloning a face from it, two tools target that step. PhotoGuard (Salman, Khaddaj, Leclerc, Ilyas, Mądry, ICML 2023) immunizes a photo against AI image-to-image and inpainting edits, so a diffusion model that tries to alter the protected picture produces a visibly broken result. Anti-DreamBooth (Van Le, Phung, Nguyen, Dao, Tran, Tran, ICCV 2023) targets personalization fine-tunes; the method “aims to add subtle noise perturbation to each user’s image before publishing in order to disrupt the generation quality of any DreamBooth model trained on these perturbed images.” Use these when the threat is a specific edit or a face clone, not indiscriminate training.

Why scraping defences are deterrents, not walls

Every technical layer here has a published weakness, which is why they are worth stacking but not worth trusting alone. Hönig, Rando, Carlini and Tramèr (ICLR 2025) found that first-generation cloaks “create a false sense of security and leave artists vulnerable to style mimicry,” with a best-of-four removal attack pushing copies of Glaze-protected art to a 56.6% quality preference, where 50% means the copy is indistinguishable from one trained on unprotected work. The poison side is no longer exempt either: LightShed (Foerster, Behrouzi, Rieger, Jadliwala, Sadeghi, USENIX Security 2025) is “a generalizable depoisoning attack that effectively identifies poisoned images and removes adversarial perturbations,” reporting a 99.98% true-positive rate at detecting Nightshade before removing it. For the practical reading, see the companion guides on whether Glaze actually works and whether Glaze and Nightshade can be bypassed. Assume anything you upload can be transformed before it is ever used to train a model.

Stopping AI from scraping your art is really about controlling the crawl and spoiling the payload, then accepting that neither is permanent. Block the compliant crawlers, opt out where you can, upload less than your best copy, cloak and poison what will be taken anyway, and block editing where a specific image matters. A scraper then has to defeat every layer, while you only need to make collection cost more than the work is worth. For the reliability verdict on the tools themselves, see the AI art-protection scorecard.

Sources

  • Shan, Cryan, Wenger, Zheng, Hanocka, Zhao (2023). GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models. USENIX Security 2023.
  • Liang, Wu (2023). Mist: Towards Improved Adversarial Examples for Diffusion Models.
  • Shan, Ding, Passananti, Wu, Zheng, Zhao (2024). Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models. IEEE S&P 2024.
  • Salman, Khaddaj, Leclerc, Ilyas, Mądry (2023). Raising the Cost of Malicious AI-Powered Image Editing (PhotoGuard). ICML 2023.
  • Van Le, Phung, Nguyen, Dao, Tran, Tran (2023). Anti-DreamBooth: Protecting Users from Personalized Text-to-Image Synthesis. ICCV 2023.
  • Hönig, Rando, Carlini, Tramèr (2025). Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI. ICLR 2025.
  • Foerster, Behrouzi, Rieger, Jadliwala, Sadeghi (2025). LightShed: Defeating Perturbation-based Image Copyright Protections. USENIX Security 2025.