Coding Agent Concept Cars

Concept cars are prototypes built by automakers not to sell, but to explore design directions — to ask “what if?” without committing to mass production. Deep into agentic psychosis, I’ve been exploring and prototyping coding agent concepts: each one pushes one idea to its logical extreme, for fun and profit.

The common substrate is agentknit, a Python library for building tool-calling agents.

The Meta-Circular Agent

Alpine Imprint RLS Concept car — Alpine Imprint RLS Concept — flickr.com (foshie), CC BY 2.0

A meta-circular evaluator is a program that can interpret its own source code — the classic example is a Lisp interpreter written in Lisp. The idea matters because it collapses the boundary between the tool and the object being manipulated. If a system can describe and rebuild itself, it is self-bootstrapping.

The specification for this agent is intentionally minimal: it defines a coding agent as something that reads files, writes files, and executes shell commands under the direction of an LLM. That is all. I gave this spec to a coding agent and asked it to produce a working implementation. It returned agent.py — a functional coding agent built from the description alone.

Then I gave that agent the same spec and asked it to reimplement itself. It succeeded. The second-generation agent is syntactically different from the first, but functionally equivalent. The loop closes: a coding agent that can read a specification and emit its own source is a meta-circular coding agent. Paper here.

Why care? Because it tests whether the abstraction of “agent” is rich enough to be its own implementation language. If the spec is sufficient, the concept is coherent.

The Seed Agent

Jay Leno's EcoJet concept car — Jay Leno’s EcoJet concept — flickr.com (Alden Jewell), CC BY 2.0

Most coding agents ship with a fixed toolbox: read_file, write_file, execute_shell, maybe a web search. The seed agent starts with exactly one tool: create_tool. On turn one it cannot read, write, or execute anything. It can only write Python functions and register them as new tools in its own live session.

Given the task “explore the agentknit package and count its public functions,” the agent does the following:

Creates find_module_path to locate the package on disk.
Creates list_directory to see what files are inside.
Creates count_file_lines to measure the code.
Only then does it perform the actual count.

See the full trace.

What this shows: tool creation is itself a tool. Give an agent one meta-operation — define new operations — and it can bootstrap any capability it needs, just as a universal Turing machine can simulate any other machine from a minimal instruction set. The seed agent cannot read files until it builds a reader; it cannot explore directories until it builds an explorer. It must think about what it needs to know before it acts, rather than relying on a pre-chewed interface.

The Remote Control Agent (June 2026)

Citroën concept car — flickr.com (Supermac1961), CC BY 2.0

Coding agents usually run on the same machine that hosts the model and the orchestration logic. The RC Agent splits them apart. Its three tools — read_file, write_file, execute_shell — are thin wrappers around SSH commands. Every tool call opens a connection to a remote host, runs the operation there, and returns the result. The LLM has no knowledge that the filesystem it is manipulating is on another machine.

The design question: can you separate where the harness runs from where the code runs? The answer is yes, and the separation is clean. The agent is a controller; the remote machine is the plant. This matters for several reasons:

Security: The target machine can be sandboxed or air-gapped from the model provider.
Resource access: The agent can operate on machines with different architectures, GPUs, or private data without moving either.
Distribution: One agent can manage many hosts through the same interface.

The only coupling is three SSH wrappers. Everything else — the prompting, the reasoning, the tool schema — stays identical. Trace: beethoven is not sos-small02.

The Async Agent

Nissan 240Z concept car — Nissan 240Z concept — commons.wikimedia.org (Mercennarius), CC BY-SA 4.0

Standard coding agents use blocking tool calls: the model issues a command, waits for it to finish, and only then continues reasoning. This is simple but wasteful. If a command takes thirty seconds, the model sits idle. If two independent tasks could run in parallel, the agent still runs them sequentially.

The async agent replaces blocking I/O with non-blocking I/O at the tool layer. When the model calls execute_shell_command, the tool returns immediately with:

a tool_exec_id (a handle for the running process),
file paths where stdout and stderr are being streamed.

The model can now issue multiple commands in flight, check their status, read partial output, and interleave reasoning. It is the difference between synchronous and asynchronous programming, applied to agent tool calls.

The shift is subtle but deep. A synchronous agent is a script executor: it runs one command, waits, runs the next. An async agent is a scheduler: it must decide which tasks are independent, which are dependent, and when to poll for results. The model reasons about time and dependency explicitly, not implicitly. See Design of an Async Coding Agent.

The Second-Guess Agent

Pontiac G8 concept car — commons.wikimedia.org (Dima Sergiyenko), CC BY-SA 4.0

Every coding agent that can execute shell commands is a security risk. A mistaken rm -rf or a malicious dependency install can destroy data. The usual response is to restrict the toolset or wrap everything in a sandbox. The second-guess agent takes a different approach: observability with cancellation.

Before any shell command executes, the agent pauses for two seconds. During that window, a human operator — or a small supervisor LLM — can review the command and abort it with Ctrl-C. Two seconds is roughly the inference latency of a lightweight classifier that can label a command as safe, suspicious, or dangerous.

The design principle: every exec call is observable and cancellable before damage is done. This is not full sandboxing; it is a human-in-the-loop (or model-in-the-loop) gate. The insight: safety can be architectural — a temporal buffer — rather than purely a matter of permission lists.

The Slash Agent

Mercedes concept car — flickr.com (Neil), CC BY 2.0

Chat interfaces use slash commands — /model, /clear, /usage, /help — as operator controls. The human types them. The slash agent inverts this: it exposes the same commands as structured tool calls that the LLM can invoke on itself.

The model can therefore:

Switch its own underlying model mid-session (e.g., downgrade to a cheaper model for a simple subtask).
Check its own cumulative token usage and decide whether to continue.
Clear its own context window when it detects that the conversation has grown too long and is degrading.

The agent becomes its own session manager. This is an inversion of control: instead of the human managing the agent’s lifecycle, the agent manages itself. The test: can the model reason about its own resource constraints and make operational decisions? See the full trace.

The Browser-as-Runtime Agent

TomTom autonomous test vehicle — commons.wikimedia.org (Geoboer), CC BY-SA 4.0

A coding agent needs a runtime: a place to execute shell commands, run Python, store files. JsChat removes that dependency entirely by making the browser the runtime.

The architecture is zero-backend. The LLM API is called directly from JavaScript in the browser. Tools execute inside the browser sandbox: localStorage is the filesystem, fetch is the network layer, the JavaScript console is the shell, and the tab is the process. There is no server, no container, no SSH host — just a web page.

This collapses the deployment stack to a single static HTML file. It also changes the trust model: your code never leaves your machine, and the only network traffic is the LLM API calls you authorize. The browser is not the UI layer on top of a real runtime; it is the runtime. Try the live demo.

In the works

The AB testing agent, the minifying agent