Agent UX: window/tab confusion + doctrine delivery

BlitzOS · findings & proposals from two agent self-tests

2026-06-24 · branch blitz-v1 · sources: Agent 5 (workflow + window typing) and Agent 3 (“log in to reddit”)

root cause

Both headline failures (the agent treating a window connection like a browsable tab, and the agent asking for a login in prose instead of raising a handoff card) share one cause: the operating doctrine reaches a spawned agent as a thin pointer that it must fetch on demand, rather than inlined in its bootstrap. The doctrine already contains both rules; the agent simply never had them in context.

1 · Context

Agent 5 — workflow + window typing

Replicated, fixed, and field-verified in a packaged VM build (real TCC).

Item | Status
spawn claude ENOENT (workflows) | fixed
cg_key keys + modifiers (End, cmd+End) | fixed
action:'paste' | fixed
connection_reveal for windows | fixed
screenshot (was empty / TCC) | fixed
helper redeploy (CFBundleVersion bump) | fixed

Agent 3 — “log in to reddit”

A cascade. The only connection available was a Safari window (showing a Google Doc); there was no Chrome in the VM. The agent tried to browse with that window: it attempted to navigate, keystroked a URL, then opened the URL via shell, which spawned a new window that was not connected. From then on every screenshot showed the wrong page, and the agent asked for credentials in prose instead of using a handoff card.

This run was, in effect, the handoff-card test that had been flagged as untested. It surfaced the discoverability and window-state-mismatch problems below.

2 · The shared root cause

A spawned agent's bootstrap inlines identity, the relay address, a couple of hard rules (web, progress), the duty, and the event loop. The full doctrine is a pointer:

bootstrap.txt: identity · relay · web/progress rules · duty · /events loop
  -> “thin pointer”: “guide is at $B/agents.md … fetch it only afterward”
agents.md (lazy): handoff rule + window-vs-tab rule live here, only if curl'd
// agent-runtime.mjs:7   "the served blitzos-agents.md is the source of truth; this is a thin pointer"
// guide fragment (duty agent):
//   "fetch the guide (curl …/agents.md) only afterward … Do not let reading the guide delay your first action."

The doctrine already answers both failures. agents.md:60: “Connecting a browser as a window gives only the toolbar AX tree, not the page, so use a tab for the web.” The handoff rule (“never prose, call request_handoff”) is in the same guide. The agent never fetched it.
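
The gap described above can be made checkable. The following is a minimal sketch, not code from the repository: `REQUIRED_RULES` and `missingRules` are hypothetical names, and the marker strings are simply phrases quoted from the guide in this report. It reports which of the two rules are absent from whatever text the agent receives on turn 1.

```javascript
// Hypothetical marker strings: each is a phrase quoted from the guide in this report.
const REQUIRED_RULES = {
  handoff: 'request_handoff',
  windowVsTab: 'use a tab for the web',
};

// Returns the names of the rules that are absent from the agent's turn-1 context.
function missingRules(contextText) {
  return Object.entries(REQUIRED_RULES)
    .filter(([, marker]) => !contextText.includes(marker))
    .map(([name]) => name);
}

// Illustrative thin-pointer text only, not the real bootstrap file.
const thinBootstrap =
  'identity / relay / duty; guide is at $B/agents.md, fetch it only afterward';

console.log(missingRules(thinBootstrap)); // [ 'handoff', 'windowVsTab' ]
```

A check of this shape could run against the generated bootstrap in a test, so that a future change back to a pointer-only bootstrap fails loudly rather than silently.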

3 · Cause A — window vs tab proposal

Finding

Proposal

4 · Cause B — handoff & doctrine delivery proposal

Finding

Proposal — inline the full doctrine into the bootstrap

bootstrap.txt: identity · relay · full agents.md (read from the live file) · duty · /events loop
  -> every agent, turn 1: handoff rule + window/tab + connection routing all in context

Trade-off: the bootstrap comment says it “stays a thin pointer” on purpose (small bootstrap, single source). I'd flip that deliberately: reading the live file preserves the single source of truth, and the added size is negligible.
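
As a rough illustration of the proposed layout, here is a sketch of a bootstrap builder that inlines the guide. Everything in it is an assumption: `buildBootstrap`, its parameters, and the delimiter lines are hypothetical, and the real bootstrap format in agent-runtime.mjs may differ. The reader is injected so the guide is read when the bootstrap is built, which is one way to keep the served file as the only copy.

```javascript
// Hypothetical builder; section order follows the proposed layout, all names are assumed.
function buildBootstrap({ identity, relay, duty, doctrinePath }, readText) {
  // Read the guide at build time so the served file stays the single source of truth.
  const doctrine = readText(doctrinePath);
  if (!doctrine.trim()) {
    // Fail loudly: an agent spawned without the doctrine repeats the failures above.
    throw new Error(`doctrine at ${doctrinePath} is empty`);
  }
  return [
    `identity: ${identity}`,
    `relay: ${relay}`,
    '--- doctrine (inlined from the live guide) ---',
    doctrine.trimEnd(),
    '--- end doctrine ---',
    `duty: ${duty}`,
    'then enter the /events loop',
  ].join('\n');
}

// Usage with a stub reader and placeholder values.
const bootstrapText = buildBootstrap(
  { identity: 'agent-3', relay: 'relay.example', duty: 'log in', doctrinePath: 'blitzos-agents.md' },
  () => 'Never ask in prose; call request_handoff.\n',
);
console.log(bootstrapText.includes('request_handoff')); // true
```

In Node the stub would be replaced by something like `(p) => fs.readFileSync(p, 'utf8')`. Throwing on an empty guide is a design choice in this sketch: it turns a missing doctrine into a spawn-time error instead of a silent behavioural gap.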

5 · Deferred (later): verify in code first

Item | Why deferred
D7 · screenshot staleness (identical MD5) | could be a real cache, or the connected window genuinely never changed (Reddit was in a different window). Check the capture path before adding fresh:true.
D8 · premature handoff resolve | “tab navigated” fires on any URL change incl. the initial load. Gate on “navigated off the login route / settled.”
connection_act silent success | helper can't always know the semantic outcome of a synthetic key; richer effect is heuristic.
AX-vs-screenshot mismatch warning · claim-next-window | deeper heuristics / a new feature; scope separately.
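
For D8, the gate could look something like the sketch below. It is an assumption-laden illustration, not the existing resolve logic: `shouldResolveHandoff`, the event shape `{ url, atMs }`, and the `loginPathPrefix` / `settleMs` options are all hypothetical. It ignores the initial load, requires the latest URL to be off the login route, and requires that no navigation has happened for a settle period.

```javascript
// Hypothetical event shape: { url: string, atMs: number }, oldest first.
// This is not the real "tab navigated" payload.
function shouldResolveHandoff(events, { loginPathPrefix, settleMs, nowMs }) {
  // The first event is the initial load, so at least one later navigation is required.
  if (events.length < 2) return false;
  const last = events[events.length - 1];
  const path = new URL(last.url).pathname;
  const offLoginRoute = !path.startsWith(loginPathPrefix);
  const settled = nowMs - last.atMs >= settleMs;
  return offLoginRoute && settled;
}

const opts = { loginPathPrefix: '/login', settleMs: 1500 };
const events = [
  { url: 'https://example.com/login', atMs: 0 },      // initial load
  { url: 'https://example.com/login/2fa', atMs: 400 }, // still on the login route
  { url: 'https://example.com/home', atMs: 1000 },     // navigated off it
];

console.log(shouldResolveHandoff(events.slice(0, 1), { ...opts, nowMs: 5000 })); // false
console.log(shouldResolveHandoff(events.slice(0, 2), { ...opts, nowMs: 5000 })); // false
console.log(shouldResolveHandoff(events, { ...opts, nowMs: 1200 }));             // false
console.log(shouldResolveHandoff(events, { ...opts, nowMs: 3000 }));             // true
```

A path-prefix test is the simplest possible definition of "login route"; sites that keep the same path before and after login would need a different signal, which is part of why this item is deferred until the code is checked.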

6 · Recommended order

Generated for review. All findings grounded in src/main/agent-runtime.mjs, blitzos-agents.md, connection-ops.mjs, and os-tools.mjs. The Agent 5 fixes are committed and field-verified; the Cause A/B items are proposals pending a go-ahead.