The magic of ControlNet
So how does it work? We’ve covered Stable Diffusion frequently before. It’s a neural network model trained on millions of images scraped from the Internet. But the key here is ControlNet, which first appeared in a research paper titled “Adding Conditional Control to Text-to-Image Diffusion Models” by Lvmin Zhang, Anyi Rao, and Maneesh Agrawala in February 2023, and quickly became popular in the Stable Diffusion community.
Typically, a Stable Diffusion image is created using a text prompt (called txt2img) or an image prompt (img2img). ControlNet introduces additional guidance in the form of information extracted from a source image, including pose detection, depth mapping, normal mapping, edge detection, and more. Using ControlNet, someone generating AI artwork can replicate the shape or pose of a subject in an image much more closely.
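To make "extracted information from a source image" a little more concrete, here is a minimal sketch of one such extraction step: turning a grayscale image into a black-and-white edge map of the kind that could serve as a guidance image. It is an illustration under stated assumptions, not ControlNet's own preprocessing. Real workflows typically rely on a dedicated preprocessor such as a Canny edge detector, whereas this sketch uses a plain Sobel filter written in NumPy, and the function name `sobel_edge_map` and its `threshold` parameter are hypothetical.

```python
import numpy as np


def sobel_edge_map(gray: np.ndarray, threshold: float = 0.25) -> np.ndarray:
    """Return a binary edge map (uint8, values 0 or 255) for a 2-D grayscale array.

    Hypothetical helper for illustration only; it is not part of ControlNet.
    """
    gray = gray.astype(np.float64)
    # Pad by one pixel (repeating the border) so the output keeps the input shape.
    p = np.pad(gray, 1, mode="edge")

    # Horizontal gradient: weighted right-hand column minus weighted left-hand column.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical gradient: weighted row below minus weighted row above.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])

    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak == 0:
        # A completely flat image has no edges.
        return np.zeros(gray.shape, dtype=np.uint8)

    # Keep only pixels whose gradient is at least `threshold` of the strongest edge.
    return np.where(magnitude / peak >= threshold, 255, 0).astype(np.uint8)


if __name__ == "__main__":
    # Synthetic source image: dark left half, bright right half.
    source = np.zeros((8, 8))
    source[:, 4:] = 1.0
    edges = sobel_edge_map(source)
    print(edges)
```

In this toy case, the only edge is the vertical boundary between the dark and bright halves, so the map lights up along that boundary and stays black elsewhere. An edge map like this records the outline of a subject without its colors or textures, which is the sort of structural hint ControlNet adds on top of the text prompt.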
Ugleh
The spiral pattern used to guide ControlNet to create the medieval village.
Using ControlNet and similar prompts, it’s easy to replicate Ugleh’s work, and others have done so to amusing effect, including checkerboard anime characters, an animation, medieval village “goatse” (surprisingly safe for work), and a medieval village version of “Girl with a Pearl Earring.”
Despite the massive attention and many offers to turn the artwork into NFTs, Ugleh has chosen to keep a low profile for now. On X, he said, “I appreciate all the positive feedback toward AI art, I do not plan on making money from my latest generations, and I will not be doing any official interviews. I am just a normal tech-savvy AI nerd who experimented with a new ControlNet technique.”
If you want to experiment with ControlNet, this site has a good tutorial. Also, Ugleh posted a step-by-step workflow, including the spiral and checkerboard template files, on Imgur.
While the artwork is remarkable, current US copyright policy suggests that the images do not meet the standards to receive copyright protection, so they may be in the public domain. While AI-generated artwork is still a contentious subject for many on ethical and legal grounds, creative enthusiasts continue to push the boundaries of what is possible for an unskilled or untrained practitioner using these new tools.