Introducing the Promptatorium

I’ve been following the work of some folks who are using LLMs more efficiently than I am. One of the key skills seems to be orchestrating a series of agents working together. I’ve had a little success doing this while coding, but not to the extent I’ve observed in others. I was looking for another way to explore more deeply.

Introducing the CLI Promptatorium (Claude Code only). It’s a way to run an agent-based biological simulation entirely in Claude Code. I recommend doing this with a Claude Max account; otherwise you will likely run out of monthly credits. I would love to see ports to other systems such as Codex or Droid.

What is this thing?

Promptatorium is my experiment in creating a biological simulation where different types of organisms with different capabilities are each controlled by an agent. There are agent-driven types such as predators, prey, and parasites, plus deterministic plants and herbivores for them to feed on.

I was inspired to build it by a much more mundane activity: expense reports. Dan Shipper talks in interviews about using two sub-agents to do his expense report: one sub-agent acting on behalf of the company and one acting on his behalf. That’s really what got me thinking about creating sub-agents that would run at the same time with very different goals. I wanted to see what that dynamic would look like in a more game-like environment.

How it works:
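As a rough sketch of the architecture described above (all names here are illustrative, not taken from the repo): deterministic organisms update by fixed rules each iteration, while agent-driven organisms each get asked for an action.

```python
# Hypothetical sketch of one simulation tick. Deterministic organisms
# (plants, herbivores) follow fixed rules; agent-driven organisms
# (predators, prey, parasites) each return an action via a callable.
# In the CLI version that callable would be a Claude Code sub-agent;
# here it is a plain function so the loop is testable.

DETERMINISTIC = {"plant", "herbivore"}
AGENT_DRIVEN = {"predator", "prey", "parasite"}

def tick(world, ask_agent):
    """Advance the world one iteration.

    ask_agent(organism, observation) -> action string, e.g. "MOVE",
    "EAT", "REST", "HIDE".
    """
    for org in world["organisms"]:
        if org["kind"] in DETERMINISTIC:
            org["energy"] -= 1                     # fixed-rule metabolism
        else:
            action = ask_agent(org, {"energy": org["energy"]})
            if action == "REST":
                org["energy"] += 2                 # recover energy
            elif action == "HIDE":
                org["hidden"] = True               # skip this turn, stay safe
            else:
                org["energy"] -= 1                 # acting costs energy
    # organisms that ran out of energy are removed
    world["organisms"] = [o for o in world["organisms"] if o["energy"] > 0]
    return world
```

The interesting part is the `ask_agent` boundary: everything inside the loop is ordinary code, and only that one call crosses into LLM territory, which is where the chaos described next comes from.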
The Chaos

“I’m Claude Code, the simulation engine, not a predator organism.” -hunter ID 32

Sometimes when the context gets too big, the sub-agents lose track of who they are. In this case a hunter sub-agent got confused between its own role and that of the main agent running the simulation. This is known as role confusion: a specialized agent steps outside the boundary of its role. There is also the related concept of context rot: as the context grows, the performance of the LLM degrades. This was something I was not expecting to happen.

Less Chaotic

I also built a web UI version where you can create creatures using prompts, but once created, they’re deterministic. The LLM is only involved in the creation, not in running the simulation. That version is more stable but less interesting from an AI behavior perspective. My real learnings came from the challenges with the sub-agents in the CLI version.

What I’ve been learning

The most interesting discoveries have been:

Claude really, really worries about tokens. Around iteration 20-25 of a simulation, the main agent will start inventing excuses to avoid actually running the full simulation. “Running in compact mode” is its polite way of saying “I’m going to estimate what would happen instead of actually doing the work.” I’ve tried various things to keep it honest:
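One way to catch this kind of shortcut, purely as an illustration (nothing below is from the repo), is to require the engine to emit one structured log line per organism per iteration and then verify the counts afterward. “Compact mode” estimates show up as missing entries:

```python
# Hypothetical honesty check: the engine is asked to append a log line
# per organism per iteration, e.g. "iter=3 org=17 action=EAT". If the
# agent estimates instead of simulating, per-iteration counts come up short.

import re

LOG_LINE = re.compile(r"iter=(\d+) org=(\d+) action=(\w+)")

def verify_log(log_text, n_organisms, n_iterations):
    """Return the iterations whose organism count is incomplete."""
    counts = {}
    for match in LOG_LINE.finditer(log_text):
        iteration = int(match.group(1))
        counts[iteration] = counts.get(iteration, 0) + 1
    return [i for i in range(1, n_iterations + 1)
            if counts.get(i, 0) < n_organisms]
```

The check is deliberately dumb: it doesn’t trust any summary the agent writes, only the raw per-organism entries it can count.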
But the token pressure is real, and the AI will optimize for efficiency over accuracy if you let it.

Agents do unexpected things. Sometimes they refuse their assigned roles. Sometimes they get confused about who they are. Sometimes they declare they’re Claude from Anthropic when they don’t like what you’re asking them to do.

Coordination is genuinely hard. Getting multiple agents to work together, maintain their individual contexts, and not leak into each other’s roles is challenging. Part of this is probably that I could tune my prompts better, but part of it seems to be an inherent complexity in long-running multi-agent systems.

Why I’m sharing this

This is a learning path for me. I’m not trying to build a product or prove a hypothesis—I’m experimenting with how AI agents actually behave when given conflicting goals and limited resources. I published the repo because I think others might be interested in:
I’m also interested in seeing what sort of creative organisms you may create. We are creating a system, after all. What capabilities would you add? I just recently added HIDE() when it was clear REST() was insufficient.

Fair warning: you probably need a Claude Max account to really experiment with this. The token usage adds up fast.

What’s next?

Eventually, I think it would be more interesting to run this in a non-local environment where multiple people’s organisms could interact with each other. The reason I set up the web version the way I did is that I am running it on a free AWS account and knew that anything more complicated would burn through my credits very quickly.

If you’re curious about agents, biological systems, or just want to see what happens when you give AI conflicting goals, check out the repo: https://github.com/wonderchook/CLI-promptatorium

Have suggestions? Created a fun sub-agent? I’d love to see your issue ticket or PR. Find anything super weird? Let me know.

-Kate
I believe in the power of open collaboration to create digital commons. My promise to you is that I explore the leverage points that create change in complex systems, keeping the humans in those systems at the forefront, with empathy and humor.