The tool sabotages machine learning models from the inside, causing them to malfunction in unpredictable ways
To animals and humans alike, poisonous nightshade looks alluring, with its radiant blooms and brightly colored berries. And to AI programs, images modified by a new tool called Nightshade look just like any other in their dataset. But unlike the countless artworks that taught image generators how to mirror the minutiae of paintings and photographs, images altered by Nightshade do the polar opposite: they introduce inaccurate data that poisons these programs from the inside, causing them to malfunction in unpredictable ways.
Nightshade does this by exploiting the vulnerabilities of popular AI programs like DALL-E, Midjourney, and Stable Diffusion, which are trained on massive datasets of images scraped from the open internet, often without their makers’ consent. The tool, developed by University of Chicago professor Ben Zhao and his collaborators, works by making pixel-level changes to images that, while invisible to the human eye, alter the way those images are interpreted by machine learning algorithms, disrupting a model’s ability to generate an accurate image in response to a text prompt. When researchers tested the attack on popular AI models, they found that after Stable Diffusion was fed just 50 poisoned images of dogs, its output for “dogs” started showing creatures with extra limbs and distorted faces. If enough altered samples are introduced, requests for photos of cars instead produce pictures of cows; hats become cakes; handbags turn into toasters.
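For readers curious what that kind of perturbation might look like, here is a minimal, hypothetical sketch of the general idea in PyTorch: nudge an image’s pixels within a tiny, near-invisible budget so that a model’s internal features drift toward those of an unrelated concept. The toy encoder, random image tensors, and parameters below are placeholders chosen for illustration; this is a generic feature-space poisoning sketch under those assumptions, not Nightshade’s actual algorithm.

```python
# Illustrative sketch only: a toy feature-space poisoning loop, not Nightshade's method.
# Idea: keep the pixel changes tiny (imperceptible to people) while pulling the image's
# machine-readable features toward an unrelated "anchor" concept.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in feature extractor; a real attack would target the generator's own encoder.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
encoder.eval()

source_image = torch.rand(1, 3, 64, 64)   # e.g., a "dog" photo (random stand-in here)
anchor_image = torch.rand(1, 3, 64, 64)   # e.g., a "cat" photo (random stand-in here)

epsilon = 0.03                              # max per-pixel change, keeping edits near-invisible
delta = torch.zeros_like(source_image, requires_grad=True)
optimizer = torch.optim.Adam([delta], lr=0.01)

with torch.no_grad():
    target_features = encoder(anchor_image)

for step in range(200):
    optimizer.zero_grad()
    poisoned = (source_image + delta).clamp(0, 1)
    # Pull the poisoned image's features toward the anchor concept's features.
    loss = nn.functional.mse_loss(encoder(poisoned), target_features)
    loss.backward()
    optimizer.step()
    # Keep the perturbation inside the imperceptibility budget.
    with torch.no_grad():
        delta.clamp_(-epsilon, epsilon)

poisoned_image = (source_image + delta).detach().clamp(0, 1)
# To a person, poisoned_image still looks like the source; to the encoder,
# its features now sit closer to the anchor concept's features.
```

In this toy setup, a model trained on many such images would associate the look of one concept with the label of another, which is the flavor of mislabeling the article describes when poisoned “dog” images push outputs toward distorted creatures.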
“These attacks on AI systems are possible because the mathematical representation deviates significantly from what humans perceive,” Zhao told Document earlier this year, explaining that Nightshade’s precursor—an image-cloaking technology called Glaze, which he created to defend artists’ work against stylistic mimicry—similarly works by leveraging the gap between what humans see and the data machines use to interpret images. But while Glaze protects artists’ intellectual property by making their work harder to copy, Nightshade has a different function: attacking the image generators themselves. “It’s controversial. It’s provocative. But I think it’s necessary,” says Zhao. “Glaze is a band-aid. It doesn’t solve the problem of intellectual property theft at the hands of machine learning algorithms—it only mitigates the immediate damage.”
Zhao notes that while some companies claim to honor opt-out requests from those who wish to have their images excluded from future AI datasets, those requests aren’t legally enforceable, meaning that companies can, more or less, do whatever they want. “These measures are all predicated on their conscience and their good behavior, and while some companies might respect these measures, there’s no way to verify it. There are very few incentives for them to abide by those mechanisms,” he explains. “We created Nightshade because right now, AI companies hold all the cards—and we need to tip the power balance back in favor of artists.”
By threatening the efficacy of popular machine learning models, Nightshade raises the stakes for the companies behind AI image generators, which have continued operating with impunity despite the numerous class-action lawsuits making their way through the courts. “At some point, security is not about forcing a binary result—it’s about increasing the cost of one path, so that people take the other,” says Zhao. “Nightshade adds some teeth to current copyright protections, so when these companies violate the rights of artists, they can at least expect there to be something waiting on the other side that provides a real disincentive from doing that.”
One might assume that, since image generators are trained on large datasets, an equally large number of poisoned samples would be required to sabotage them. But according to Zhao, Nightshade is “extremely potent,” with fewer than one hundred altered images needed to create a major impact. This is because the tainted samples produce a cumulative effect, corrupting the output not only for a single prompt but also for related concepts. For example, if one were to introduce a few dozen images of dragons, modified by Nightshade so that AI interprets them as castles, the effects would bleed into categories like “fantasy art.” After many poison attacks, the image generator’s general capabilities may also begin to degrade, undermining the trustworthiness of the program. This presents a significant challenge to AI companies: “Many will attempt to create countermeasures. They might slow down AI development, or limit the intake of data to ensure they’re only using images that haven’t been protected by Nightshade,” says Zhao. “There’s a lot of risk for these companies, and that’s by design.”
The first tool of its kind, Nightshade represents a sea change in the battle for intellectual property rights, with artists moving beyond simply defending their work to taking an offensive stance. If enough people choose to apply Nightshade to their images and infect the massive databases on which AI models are trained, it could throw a sizable wrench into the gears of popular image generators, forcing companies like OpenAI to think twice before training their models on artists’ work without consent.
Glaze is currently available to the public, and when Zhao releases Nightshade, he plans to give users the choice between simply protecting their art and actively combating the ability of machine learning models to copy it. For him, it’s a project spurred by passion, not profit: “Every researcher in every field dreams of being able to make a real impact,” he says. “I’m not an artist by any stretch of the imagination—but human creativity is something we all strive for. So when artists began having their work stolen by image generators, I saw a grave injustice that needed to be fixed. Artists are in a vulnerable position because they often don’t have money or representation; they don’t have unions organizing to protect them like writers and actors do, and they don’t have lawyers like musicians do. Yet they are some of the most diverse and creative people in society—and they deserve to be helped.”