I need Blender to do what I say. That's the problem I'm solving.

Normally, making a 3D scene means clicking around for hours. Set up objects. Position the camera. Tweak the lights. Render. Hate it. Move the light two inches. Render again. Hate it slightly less. Repeat for eternity.

I want to describe what I want and have it appear. So I'm connecting Claude to Blender.

Does this sound insane? It felt insane when I started.

How It Works

The basic idea:

┌──────────────────────────────────────────────────────────────┐
│                    Orchestration Layer                       │
│              (Claude Code / Task Management)                 │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐                │
│  │  Scene   │───▶│ Blender  │───▶│  Render  │                │
│  │  Prompt  │    │  Script  │    │  Output  │                │
│  └──────────┘    └──────────┘    └──────────┘                │
│       ▲                               │                      │
│       │         Validation            │                      │
│       └───────── Loop ◀───────────────┘                      │
│                                                              │
├──────────────────────────────────────────────────────────────┤
│                    Blender (Headless)                        │
│                  Python API / bpy module                     │
└──────────────────────────────────────────────────────────────┘

I describe a scene. Claude writes Python code. Blender runs it headless (no GUI). I get a render. If it's wrong, Claude looks at the image and tries again.
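
The "Blender runs it headless" step could look roughly like this. It's a minimal sketch, not the actual script: the function names and the `--output` argument are hypothetical, and it assumes a `blender` executable on the PATH. The `--background` and `--python` flags are Blender's own, and anything after the bare `--` is left for the script to read from `sys.argv`.

```python
import subprocess
from pathlib import Path


def build_blender_command(script_path, output_path, blender_bin="blender"):
    """Build the argument list for running a generated script in headless Blender."""
    return [
        blender_bin,
        "--background",                # run without the GUI
        "--python", str(script_path),  # execute the generated scene script
        "--",                          # Blender ignores everything after this
        "--output", str(output_path),  # hypothetical flag the script parses itself
    ]


def run_scene_script(script_path, output_path, timeout_s=600):
    """Run Blender and return (succeeded, combined log text)."""
    cmd = build_blender_command(script_path, output_path)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    # Blender may still exit 0 when the script raised, so also check for the file.
    succeeded = result.returncode == 0 and Path(output_path).exists()
    return succeeded, result.stdout + result.stderr
```

Keeping the log text around matters here: when a render fails, the traceback in that log is what gets handed back to Claude for the next attempt.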

That's the theory anyway. Reality is messier.

What I've Built So Far

Two scripts to test if this actually works:

poc_create_scene.py: Takes a description, generates Blender Python code, creates the scene, spits out a render.

poc_validation_loop.py: Renders, checks if it looks right, fixes what's wrong, repeats until it's decent. Or until I give up.
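
The shape of that loop, as a minimal sketch rather than the real poc_validation_loop.py. The `render`, `critique`, and `revise` callables and the `Verdict` class are hypothetical stand-ins for "run Blender", "have Claude look at the image", and "have Claude patch the script". The pass limit is the "until I give up" part.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    ok: bool                                   # True when the render is accepted
    notes: list = field(default_factory=list)  # what to fix when it is not


def validation_loop(script, render, critique, revise, max_passes=4):
    """Render, critique, revise; stop on approval or when passes run out.

    Returns (final_script, passes_used, approved).
    """
    for passes_used in range(1, max_passes + 1):
        image_path = render(script)
        verdict = critique(image_path)
        if verdict.ok:
            return script, passes_used, True
        # Skip the revision on the last pass: it would never get rendered.
        if passes_used < max_passes:
            script = revise(script, verdict.notes)
    return script, max_passes, False
```

Passing the three steps in as functions means the loop can be tested with fakes, without paying for a render or an API call on every run.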

What's Hard

Vague words don't translate to code. "Make it more dramatic" means nothing to Blender. I'm trying to build a vocabulary that bridges fuzzy creative direction and specific operations. It's slow going. Sometimes I spend an hour trying to explain what "moody" means in terms of light falloff and color temperature.
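
One way to picture that vocabulary is a lookup table from mood words to concrete light parameters. This is only a sketch: the `LightRecipe` fields, the mood names, and every number in it are illustrative assumptions, not tuned values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LightRecipe:
    color_temp_k: int   # color temperature in kelvin; lower is warmer
    key_power_w: float  # main light power in watts
    fill_ratio: float   # fill light power as a fraction of the key light


# Illustrative numbers only; a real table would need tuning against renders.
MOOD_VOCABULARY = {
    "moody": LightRecipe(3200, 300.0, 0.1),
    "dramatic": LightRecipe(4000, 800.0, 0.05),
    "warm evening": LightRecipe(2700, 500.0, 0.3),
    "neutral": LightRecipe(5600, 600.0, 0.5),
}


def recipe_for(description, default="neutral"):
    """Return the first vocabulary entry (in table order) found in the description."""
    text = description.lower()
    for mood, recipe in MOOD_VOCABULARY.items():
        if mood in text:
            return recipe
    return MOOD_VOCABULARY[default]
```

With this table, `recipe_for("Make it more dramatic")` picks the low-fill, high-power recipe. Substring matching is crude, but it makes the point: "dramatic" has to turn into numbers somewhere before Blender can do anything with it.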

Blender has tons of state. The AI needs to know what's already in the scene before it can change anything. Getting that context right is tricky. Sometimes it adds a second camera instead of moving the first one. Sometimes it deletes everything and starts over. Neither is ideal.
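
A sketch of the kind of context that helps: dump what's already in the scene and put it in the prompt before asking for changes. The function names are hypothetical. The code only relies on each object having `name`, `type`, and `location`, which Blender objects do, so inside Blender it could be called with `bpy.context.scene.objects`.

```python
import json


def summarize_scene(objects):
    """Return a JSON string listing each object's name, type, and location."""
    summary = [
        {
            "name": obj.name,
            "type": obj.type,  # e.g. "MESH", "CAMERA", "LIGHT"
            "location": [round(c, 3) for c in obj.location],
        }
        for obj in objects
    ]
    return json.dumps(summary, indent=2)


def find_existing(objects, object_type):
    """Return the objects of one type, so a script can move one instead of adding another."""
    return [obj for obj in objects if obj.type == object_type]
```

Checking `find_existing(objects, "CAMERA")` before creating a camera is the cheap guard against the second-camera problem. It does nothing for the delete-everything problem.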

Rendering takes forever. Even a simple scene takes several seconds to render. A complex one? Minutes. The feedback loop only works if you can iterate fast. I'm using preview renders and simpler geometry to speed things up. Still not fast enough.
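
For the preview renders, the two obvious knobs are resolution and sample count. A minimal sketch, assuming the Cycles engine (so `scene.cycles.samples` exists). The function names and default values are my own placeholders.

```python
def apply_preview_settings(scene, scale_percent=25, samples=16):
    """Lower resolution and sample count for fast draft renders.

    Returns the previous values so a final render can restore them.
    """
    previous = {
        "resolution_percentage": scene.render.resolution_percentage,
        "samples": scene.cycles.samples,
    }
    scene.render.resolution_percentage = scale_percent
    scene.cycles.samples = samples
    return previous


def restore_settings(scene, previous):
    """Put back the values saved by apply_preview_settings."""
    scene.render.resolution_percentage = previous["resolution_percentage"]
    scene.cycles.samples = previous["samples"]
```

At 25% scale each dimension shrinks to a quarter, so the render has one sixteenth of the pixels. That is enough to judge framing and rough lighting, not enough to judge final quality.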

How do you know when it's good enough? I don't have a great answer yet. Right now it's mostly me looking at it and deciding. Which kind of defeats the purpose of automation. But I don't have a better idea.

What's Actually Working

Some stuff does work:

Basic scene setup from descriptions: works pretty well. "Put a table in the center with two chairs" actually works.

Camera positioning from shot descriptions: surprisingly good. "Medium shot from the character's left" lands it close enough.

Simple lighting from mood descriptions: decent. "Warm evening light from the window" gets you 70% of the way there.

The feedback loop improving results over passes: yes, it helps. But it's also slow and expensive.

What's Next

Hook up my actual character and environment assets. Figure out animation (right now it's just static scenes). Improve the quality checking. Run multiple shots at once.

Also: figure out if this is actually faster than just doing it manually. The jury's still out.

Stuff I Don't Know Yet

How much cinematography knowledge should I bake in? How specific do scene descriptions need to be? What do I do when the creative direction is intentionally vague? When do I just give up and do it manually?

Still figuring it out. I'll post more when I learn more. Or when I give up and admit this was a terrible idea.