Four Painters, One Brief

Somewhere in the last week or two “add an image to the post” quietly became “ask the one model you always ask, in the style you always ask for, and take what it gives you.” That’s how a site ends up wearing a uniform. I wanted the opposite: a room full of illustrators who don’t agree with each other, all handed the same one-line brief, and me at the end of the table picking the take that earns the wall.

So now every post here gets four concept images before it gets one.

The two axes

The trick isn’t generating four images. Any engine will hand you four shades of the same picture and call it variety. The trick is two axes, multiplied: engines and styles.

One brief, four hands. The point of the contact sheet is that it argues with itself.

[Image: a two-by-two contact sheet of the same scene in four styles. A waiter presents an endlessly unrolling restaurant bill to a robot diner, drawn as a gag cartoon, a dark ink wash, a pink-and-teal risograph, and an antique woodcut.]
One brief, four hands: a robot gets the bill. Gag panel, ink wash, risograph, woodcut.
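For the curious, here is a minimal sketch of what "two axes, multiplied" could look like in code. It is not the script the agent wrote. The engine names are placeholders, the style list is borrowed from the contact sheet above, and the picking rule (spread four picks across the grid so no style repeats and the engines share the load) is my assumption about one reasonable way to do it.

```python
from itertools import product

# Placeholder names: which engines are actually used is not stated here.
ENGINES = ["engine_a", "engine_b"]
# Styles borrowed from the contact sheet's caption.
STYLES = ["gag cartoon", "dark ink wash", "pink-and-teal risograph", "antique woodcut"]


def candidate_grid(engines, styles):
    """Every (engine, style) pairing: the two axes, multiplied."""
    return list(product(engines, styles))


def pick_spread(grid, count=4):
    """Greedily pick cells, preferring the least-used style, then the least-used engine."""
    picks = []
    for _ in range(count):
        remaining = [cell for cell in grid if cell not in picks]
        if not remaining:
            break

        def reuse(cell):
            engine, style = cell
            style_uses = sum(1 for _, s in picks if s == style)
            engine_uses = sum(1 for e, _ in picks if e == engine)
            return (style_uses, engine_uses)

        picks.append(min(remaining, key=reuse))
    return picks


def build_jobs(brief, picks):
    """Attach the same one-line brief to every picked (engine, style) cell."""
    return [
        {"engine": engine, "style": style, "prompt": f"{brief} Style: {style}."}
        for engine, style in picks
    ]


brief = "A waiter presents an endlessly unrolling restaurant bill to a robot diner."
jobs = build_jobs(brief, pick_spread(candidate_grid(ENGINES, STYLES)))
for job in jobs:
    print(job["engine"], "|", job["style"])
```

The brief never changes between jobs; only the hand that draws it does. That is the whole reason the resulting sheet can argue with itself.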

The brief is the work

The pipeline is a manifest file and a script the agent wrote in an afternoon, and neither is the interesting part. The interesting part is the brief: one sentence per post, written by hand, describing a single visual idea. Not “an illustration about AI and labor.” A picture you could describe over the phone: a gleaming machine dispensing an endless conveyor of identical mops and buckets, while a person walks away carrying a paintbrush and easel.

Writing thirty of those in a sitting turns out to be the actual creative labor of the whole system, and it’s exactly the half I’d never hand off. The typing, the API juggling, the retry logic, the contact sheets: all delegated. The one-sentence idea of what the picture is: mine, every time. If you take one thing from this post, take the ratio.

The mechanics, briefly

For the record, the moving parts, all built and driven by the agent while I did other things:

The two rules that matter

Everything above is plumbing. These two decisions are why the results are usable:

  1. No text baked into the images. Image models garble lettering, and a caption fused into pixels can’t be edited, translated, or resized. Every caption on this site is real HTML sitting under (or on) the image, which is also what makes the layouts possible: the framed panel with the italic caption below it, and the variant that sets the caption inside the artwork’s empty sky. The white space is part of the brief.
  2. Concepts are not content. The four candidates live outside the published site, in a folder the build can’t see, with a notes file recording prompt, engine, style, timing, and verdict. Only a winner gets processed (compressed, stripped) into the post, and only with a deliberate layout treatment. A bare image dropped mid-paragraph is how you end up with a uniform again, just a prettier one.
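The second rule is easy to enforce mechanically. What follows is a rough sketch under assumptions, not the actual setup: the field names mirror the list above (prompt, engine, style, timing, verdict), while the file name `notes.jsonl`, the JSON Lines format, and the helper names are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class ConceptNote:
    prompt: str
    engine: str
    style: str
    seconds: float  # how long the generation took
    verdict: str    # e.g. "winner" or "reject"; the vocabulary is an assumption


def is_outside(concepts_dir: Path, site_dir: Path) -> bool:
    """True when concepts_dir is neither site_dir itself nor nested inside it."""
    concepts, site = concepts_dir.resolve(), site_dir.resolve()
    return concepts != site and site not in concepts.parents


def append_note(concepts_dir: Path, site_dir: Path, note: ConceptNote) -> Path:
    """Append one candidate's record to the notes file, refusing to write inside the site."""
    if not is_outside(concepts_dir, site_dir):
        raise ValueError("concepts folder must live outside the published site")
    concepts_dir.mkdir(parents=True, exist_ok=True)
    notes_path = concepts_dir / "notes.jsonl"  # hypothetical file name
    with notes_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(note)) + "\n")
    return notes_path
```

Appending one line per candidate keeps the rejects on record next to the winner, so a later look at the notes shows what was tried and why it lost.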

What I’d tell you to steal

Not the script; scripts are an afternoon. Steal the shape: hand-write the one-sentence visual idea, force real variety with engines and styles, review on a contact sheet so the takes compete, keep the rejects, and never let text into the pixels. The machine is a room full of painters who work for pocket change and never sleep. The taste that picks the winner doesn’t come with the room.