Design of an Async Coding Agent

Core Concept

A conventional coding agent is synchronous: it issues a shell command, blocks until it finishes, reads the output, then decides what to do next. This serialises work that could often overlap: builds, tests, installs, and downloads that do not depend on one another can run concurrently.

An asynchronous coding agent decouples command dispatch from command observation. The LLM fires shell commands without waiting for them, accumulates in-flight work, and is notified when each one finishes. It can start a build, kick off a test suite, and begin reading documentation simultaneously, then reason over results as they arrive — in whatever order they complete.

Architecture

Non-blocking tool: `execute_shell_command`

The central primitive. When invoked, it:

Spawns the process in the background (via a subprocess with a FIFO for stdin).
Returns immediately with a tool_exec_id and paths to the live stdin, stdout, and stderr files.

While the command runs, the LLM can read the stdout and stderr files for partial output, or simply proceed with other work.
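
The dispatch step above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function name spawn_background and the root parameter are hypothetical, and stdin is connected to the null device rather than the FIFO the design describes, so no stdin_localfile is returned.

```python
import os
import subprocess
import tempfile
import uuid
from datetime import datetime


def spawn_background(command, cwd=None, root=None):
    """Start `command` without waiting for it and return (process, tool result)."""
    # Assumption: per-execution directories live under <tmp>/agentknit.
    root = root or os.path.join(tempfile.gettempdir(), "agentknit")
    exec_id = "exec-" + uuid.uuid4().hex[:4]
    exec_dir = os.path.join(root, exec_id)
    os.makedirs(exec_dir, exist_ok=True)
    stdout_path = os.path.join(exec_dir, "stdout")
    stderr_path = os.path.join(exec_dir, "stderr")

    # The child writes directly to these files, so a reader can see
    # partial output while the command is still running.
    with open(stdout_path, "wb") as out, open(stderr_path, "wb") as err:
        proc = subprocess.Popen(
            command,
            shell=True,
            cwd=cwd,
            stdin=subprocess.DEVNULL,  # simplification: the design uses a FIFO
            stdout=out,
            stderr=err,
        )

    result = {
        "tool_exec_id": exec_id,
        "started_at": datetime.now().isoformat(timespec="seconds"),
        "cwd": os.path.abspath(cwd or os.getcwd()),
        "stdout_localfile": stdout_path,
        "stderr_localfile": stderr_path,
    }
    return proc, result
```

Popen returns as soon as the child has started, which is what lets the tool call return before the command finishes.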

Slow command (still running) — tool result:

{
  "tool_exec_id": "exec-a3f7",
  "started_at": "2026-06-24T14:02:11",
  "cwd": "/home/martin/myproject",
  "stdin_localfile":  "/tmp/agentknit/exec-a3f7/stdin",
  "stdout_localfile": "/tmp/agentknit/exec-a3f7/stdout",
  "stderr_localfile": "/tmp/agentknit/exec-a3f7/stderr"
}

The LLM moves on. Later, the REPL injects a completion notification:

Background command exec-a3f7 finished after 18.4s (returncode=0).
cwd: /home/martin/myproject
stdout: (see /tmp/agentknit/exec-a3f7/stdout)
last 3 lines:
----------------------------------------------------------------------
Ran 42 tests in 17.9s
OK

Fast command (finished within threshold, small output) — tool result:

{
  "tool_exec_id": "exec-b81c",
  "started_at": "2026-06-24T14:02:15",
  "cwd": "/home/martin/myproject",
  "stdin_localfile":  "/tmp/agentknit/exec-b81c/stdin",
  "stdout_localfile": "/tmp/agentknit/exec-b81c/stdout",
  "stderr_localfile": "/tmp/agentknit/exec-b81c/stderr",
  "completed": true,
  "returncode": 0,
  "duration_time": 0.043,
  "stdout": "main.py\nutils.py\nREADME.md\n",
  "stderr": ""
}

Fast commands inline their output so the LLM does not need to spend an additional tool call reading results. No completion event is pushed for them — output is already present in the tool result. Slow commands remain background tasks.
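
One way to implement the fast/slow split is to wait on the process for a short, bounded time before returning. The sketch below assumes a threshold of 0.5 s and an inline cap of 4096 bytes; neither value is stated in this design, and the helper name inline_if_fast is hypothetical. It also assumes that a command which finishes quickly but produces large output is treated like a slow command.

```python
import os
import subprocess
import time

FAST_THRESHOLD_S = 0.5   # assumed value; the design does not state the threshold
MAX_INLINE_BYTES = 4096  # assumed cap for "small output"


def inline_if_fast(proc, started, stdout_path, stderr_path,
                   threshold_s=FAST_THRESHOLD_S, max_inline=MAX_INLINE_BYTES):
    """Return the extra tool-result fields for a fast command, else None.

    `started` is a time.monotonic() reading taken when the process was spawned.
    """
    try:
        returncode = proc.wait(timeout=threshold_s)
    except subprocess.TimeoutExpired:
        return None  # still running: remains a background task

    duration = time.monotonic() - started
    total = os.path.getsize(stdout_path) + os.path.getsize(stderr_path)
    if total > max_inline:
        return None  # assumption: large output is handled like a slow command

    with open(stdout_path, errors="replace") as out, \
            open(stderr_path, errors="replace") as err:
        return {
            "completed": True,
            "returncode": returncode,
            "duration_time": round(duration, 3),
            "stdout": out.read(),
            "stderr": err.read(),
        }
```

A caller would merge the returned dict into the base tool result when it is not None, and otherwise register the process for a completion event.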

An optional `when` parameter delays the start by N minutes, enabling deferred scheduling (e.g. start a command 10 minutes from now).
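
Deferred starts could be handled with a timer thread, as in this small sketch. The function name schedule and the start_fn callback are hypothetical; the design does not say how the delay is implemented.

```python
import threading


def schedule(start_fn, when_minutes=0):
    """Call start_fn immediately, or after `when_minutes` on a timer thread.

    Returns the Timer for a deferred start (so it can be cancelled), else None.
    """
    if when_minutes <= 0:
        start_fn()
        return None
    timer = threading.Timer(when_minutes * 60, start_fn)
    timer.daemon = True  # do not keep the agent process alive just for the timer
    timer.start()
    return timer
```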

Completion queue

A shared queue (async_completion_queue) receives an event when each background process exits. The event carries the tool_exec_id, return code, duration, working directory, and file paths for stdout/stderr.

Completion queue event (dict pushed when a background process exits):

{
  "tool_exec_id": "exec-a3f7",
  "returncode": 0,
  "stdout_file": "/tmp/agentknit/exec-a3f7/stdout",
  "stderr_file": "/tmp/agentknit/exec-a3f7/stderr",
  "duration": 18.403,
  "cwd": "/home/martin/myproject"
}

The REPL’s background monitor thread polls this queue continuously. When an event arrives, it sets an alert flag without blocking the main thread.
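
The queue and monitor thread might look like the sketch below. It assumes one helper thread per background process that waits for exit and pushes the event; alert_flag, pending_events, watch_process, and monitor are hypothetical names. The duration here is measured from when watching begins, which only approximates the command's real run time.

```python
import queue
import threading
import time

async_completion_queue = queue.Queue()
alert_flag = threading.Event()  # checked by the REPL loop on each iteration
pending_events = []             # hypothetical hand-off list for the REPL
pending_lock = threading.Lock()


def watch_process(proc, exec_id, cwd, stdout_file, stderr_file):
    """Wait for one process on a helper thread, then push its completion event."""
    def waiter():
        started = time.monotonic()
        returncode = proc.wait()
        async_completion_queue.put({
            "tool_exec_id": exec_id,
            "returncode": returncode,
            "stdout_file": stdout_file,
            "stderr_file": stderr_file,
            "duration": round(time.monotonic() - started, 3),
            "cwd": cwd,
        })
    threading.Thread(target=waiter, daemon=True).start()


def monitor(stop):
    """Poll the queue until `stop` is set; raise the alert flag on each event."""
    while not stop.is_set():
        try:
            event = async_completion_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        with pending_lock:
            pending_events.append(event)
        alert_flag.set()  # the main thread only ever reads this flag
```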

REPL event loop

The interactive loop checks the alert flag on every iteration:

Completions pending: format the completion event into a natural-language notification and inject it as the next LLM turn. The LLM sees "Background command X finished after Ns (returncode=0). stdout: ..." and reacts by checking results, deciding what to run next, or continuing other work.
No completions, user typed input: run a normal LLM turn.
Completion arrives between prompt and Enter: the typed input is re-queued behind the completion, so the LLM processes the completion first and then the user’s message.

This makes completions first-class events that trigger LLM reasoning, not just side effects polled on request.
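
The ordering rule in the three cases above can be expressed as a small pure function. This is only a sketch: next_turns and the format_event callback are hypothetical names, and a real loop would also read input and call the LLM.

```python
from collections import deque


def next_turns(pending_completions, user_input, format_event):
    """Return the LLM turns for this iteration, completions first.

    pending_completions: deque of completion events (drained by this call).
    user_input: text the user typed, or "" if none.
    format_event: callable turning one event into notification text.
    """
    turns = []
    while pending_completions:
        event = pending_completions.popleft()
        turns.append(("completion", format_event(event)))
    if user_input:
        # Typed input is queued behind any completions that arrived first.
        turns.append(("user", user_input))
    return turns
```

Keeping the ordering logic free of I/O makes it easy to check the case where a completion arrives while the user is still typing.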

File tools

Alongside execute_shell_command, the agent has read_file, write_file, and edit (surgical substring replacement). These are synchronous and inline — they complete before the LLM continues, because file I/O is cheap and the result is needed immediately to produce the next action.
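
A substring-replacement edit tool could be as small as the sketch below. The requirement that the old text match exactly once is an assumption made here to keep edits unambiguous; the design does not specify how multiple or missing matches are handled.

```python
from pathlib import Path


def edit(path, old, new):
    """Replace the single occurrence of `old` in the file at `path` with `new`."""
    if not old:
        raise ValueError("old text must not be empty")
    file = Path(path)
    text = file.read_text(encoding="utf-8")
    count = text.count(old)
    if count != 1:
        # Assumption: refuse ambiguous or missing matches instead of guessing.
        raise ValueError(f"expected exactly one match, found {count}")
    file.write_text(text.replace(old, new, 1), encoding="utf-8")
```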

Design principles

Async by default, sync when trivial. Fast commands inline; slow commands background. The LLM calls the same tool either way and does not need to predict which case applies; the infrastructure applies the threshold transparently.

Push, don’t poll. The LLM is notified when work finishes rather than being asked to check on it. This eliminates empty polling turns and keeps the context lean.

Natural-language completion events. Finished commands are delivered as prose ("Background command X finished…") rather than structured data. This is the LLM’s native interface and requires no special handling.
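
As an illustration, a completion event in the shape shown under "Completion queue" could be rendered into the notification format roughly like this. The function name format_completion and the tail_lines parameter are hypothetical.

```python
from collections import deque


def format_completion(event, tail_lines=3):
    """Render a completion event dict as a plain-text notification."""
    # Keep only the last few lines of stdout; the full file stays on disk.
    with open(event["stdout_file"], errors="replace") as f:
        tail = deque(f, maxlen=tail_lines)

    lines = [
        f"Background command {event['tool_exec_id']} finished after "
        f"{event['duration']:.1f}s (returncode={event['returncode']}).",
        f"cwd: {event['cwd']}",
        f"stdout: (see {event['stdout_file']})",
        f"last {len(tail)} lines:",
    ]
    lines.extend(line.rstrip("\n") for line in tail)
    return "\n".join(lines)
```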

Resumable sessions. Multi-hour tasks survive process restarts because the conversation history is snapshotted and the session ID is stable across invocations from the same directory.
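
One possible way to get a stable session ID is to derive it from the working directory, as sketched below. The hashing scheme, the JSON snapshot format, and the names session_id_for, save_snapshot, and load_snapshot are all assumptions for illustration; the design only states that the ID is stable per directory and that history is snapshotted.

```python
import hashlib
import json
import os
from pathlib import Path


def session_id_for(directory):
    """Derive a stable ID from the absolute path of the directory."""
    absolute = os.path.abspath(directory)
    digest = hashlib.sha256(absolute.encode("utf-8")).hexdigest()
    return "session-" + digest[:12]


def save_snapshot(store, directory, history):
    """Write the conversation history for this directory's session."""
    path = Path(store) / (session_id_for(directory) + ".json")
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(history), encoding="utf-8")
    os.replace(tmp, path)  # rename so a crash never leaves a half-written file
    return path


def load_snapshot(store, directory):
    """Return the saved history for this directory, or [] if none exists."""
    path = Path(store) / (session_id_for(directory) + ".json")
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))
```

Saving after every turn and loading at startup would let a restarted process pick up the same conversation when launched from the same directory.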