BlitzOS · findings & proposals from two agent self-tests
Both headline failures (the agent treating a window connection like a browsable tab, and the agent asking for a login in prose instead of a handoff card) come from one cause: the operating doctrine is delivered to a spawned agent as a thin pointer it must fetch on demand, not inlined. The doctrine already contains both rules. The agent simply never had it in context.
Replicated, fixed, and field-verified in a packaged VM build (real TCC).
| Item | Status |
|---|---|
| spawn claude ENOENT (workflows) | fixed |
| cg_key keys + modifiers (End, cmd+End) | fixed |
| action:'paste' | fixed |
| connection_reveal for windows | fixed |
| screenshot (was empty / TCC) | fixed |
| helper redeploy (CFBundleVersion bump) | fixed |
A cascade: only a Safari window connection was available (a Google Doc), and there was no Chrome in the VM. The agent tried to browse with that connection (navigate, then keystroking a URL), then opened the URL via shell, which spawned a disconnected window. From there every screenshot showed the wrong page, and the agent asked for credentials in prose instead of a handoff card.
This is, in effect, the handoff-card test that was flagged untested. It surfaced the discoverability and window-state-mismatch problems below.
A spawned agent's bootstrap inlines identity, the relay address, a couple of hard rules (web, progress), the duty, and the event loop. The full doctrine is a pointer:
“… $B/agents.md … fetch it only afterward”

// agent-runtime.mjs:7
"the served blitzos-agents.md is the source of truth; this is a thin pointer"

// guide fragment (duty agent):
// "fetch the guide (curl …/agents.md) only afterward … Do not let reading the guide delay your first action."
The doctrine already answers both failures. agents.md:60: “Connecting a browser as a window gives only the toolbar AX tree, not the page, so use a tab for the web.” The handoff rule (“never prose, call request_handoff”) is in the same guide. The agent never fetched it.
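One way to close this gap is to inline the doctrine text when the bootstrap is built, so the rules are in context before the first action. The following is a minimal sketch under stated assumptions: `buildBootstrap`, its parameter names, the section markers, and the placeholder relay URL are all hypothetical and do not come from agent-runtime.mjs. The doctrine is passed in as a string; in practice the caller would read the served blitzos-agents.md once per build.

```javascript
// Hypothetical sketch: inline the doctrine into the bootstrap instead of pointing at it.
// Function name, parameters, and marker lines are illustrative assumptions.
function buildBootstrap({ identity, relayUrl, duty, doctrineText }) {
  if (typeof doctrineText !== 'string' || doctrineText.trim() === '') {
    // Fail at build time rather than spawn an agent without its rules.
    throw new Error('doctrine text is empty; refusing to build bootstrap');
  }
  return [
    `You are ${identity}. Relay: ${relayUrl}`,
    `Duty: ${duty}`,
    '--- BEGIN DOCTRINE (inlined from blitzos-agents.md) ---',
    doctrineText.trim(),
    '--- END DOCTRINE ---',
    `A fresh copy is served at ${relayUrl}/agents.md if needed.`,
  ].join('\n');
}

// Example call with placeholder values (none of these come from the real runtime).
const bootstrap = buildBootstrap({
  identity: 'duty-agent',
  relayUrl: 'http://localhost:8080',
  duty: 'post the update',
  doctrineText:
    'Use a tab for the web.\nNever ask for a login in prose; call request_handoff.\n',
});
console.log(bootstrap);
```

Because the text is read from the same file that is served, nothing is copied by hand, so the inlined rules and the served guide cannot drift apart.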
- A window connection has no navigate or run_js. Web work needs a tab (Blitz Chrome, or a connected browser tab).
- connection_list does return type and capabilities per connection, but nothing reads as “this can't be browsed,” and connection_navigate (documented TAB-only) fails on a window with a bare “verb "navigate" is not supported.”
- Shell open → a second, unconnected window → wrong screenshots for the rest of the session.
- connection_navigate / run_js on a window should return a useful error: “this is a window connection (a native app), not a browsable tab. To open a URL, use Blitz Chrome (blitz_chrome_open) or ask the user to open it in their browser and connect that tab.”
- “… request_handoff instead.”
- The handoff rule lives in agents.md, so it wasn't in context when the login wall appeared.
- Inline the doctrine by reading the same blitzos-agents.md that's served at /agents.md, at bootstrap-build time → no copy, no drift, single source of truth preserved.
- “… $B/agents.md only if you want a fresh copy.”
- … connection_read/act descriptions too.

Trade-off: the bootstrap comment says it “stays a thin pointer” on purpose (small bootstrap, single source). I'd flip that deliberately: reading the live file keeps a single source of truth, and the size cost is negligible.
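The proposed error for tab-only verbs could look roughly like this. It is a sketch, not the code in connection-ops.mjs: the connection shape (`{ id, type }`), the type values `'window'` and `'tab'`, and the name `checkVerbAllowed` are assumptions made for illustration.

```javascript
// Hypothetical sketch: replace the bare "not supported" failure with an actionable one.
// The connection shape and the 'window' / 'tab' type values are assumptions.
const TAB_ONLY_VERBS = new Set(['navigate', 'run_js']);

function checkVerbAllowed(connection, verb) {
  // Verbs that are not tab-only, and any verb on a tab, pass through unchanged.
  if (!TAB_ONLY_VERBS.has(verb) || connection.type === 'tab') {
    return { ok: true };
  }
  return {
    ok: false,
    error:
      `"${verb}" needs a browsable tab, but connection ${connection.id} is a ` +
      `${connection.type} connection (a native app). To open a URL, use Blitz Chrome ` +
      '(blitz_chrome_open) or ask the user to open it in their browser and connect that tab.',
  };
}

const result = checkVerbAllowed({ id: 'c1', type: 'window' }, 'navigate');
console.log(result.error);
```

The point of the design is that the error names the next step, so an agent that never read the doctrine still learns what to do instead of falling back to a shell open.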
| Item | Why deferred |
|---|---|
| D7 · screenshot staleness (identical MD5) | could be a real cache, or the connected window genuinely never changed (Reddit was in a different window). Check the capture path before adding fresh:true. |
| D8 · premature handoff resolve | “tab navigated” fires on any URL change incl. the initial load. Gate on “navigated off the login route / settled.” |
| connection_act silent success | helper can't always know the semantic outcome of a synthetic key; richer effect is heuristic. |
| AX-vs-screenshot mismatch warning · claim-next-window | deeper heuristics / a new feature; scope separately. |
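For D8, the gate “navigated off the login route / settled” can be sketched as a small state machine. Everything here is an assumption for illustration: the event shape, the `createHandoffGate` name, the 1500 ms settle window, and comparing by URL path only (a stricter version would also compare origins).

```javascript
// Hypothetical sketch of the D8 gate: resolve a handoff only after the tab has
// left the login route and stayed on the new URL for a settle period.
function createHandoffGate({ loginUrl, settleMs = 1500 }) {
  const loginPath = new URL(loginUrl).pathname;
  let lastUrl = null;
  let lastChangeAt = 0;

  return {
    // Record every "tab navigated" event, including the initial load.
    onNavigated(url, now) {
      if (url !== lastUrl) {
        lastUrl = url;
        lastChangeAt = now;
      }
    },
    // True only when the tab is off the login route and has been stable for settleMs.
    shouldResolve(now) {
      if (lastUrl === null) return false;
      const offLogin = new URL(lastUrl).pathname !== loginPath;
      return offLogin && now - lastChangeAt >= settleMs;
    },
  };
}

const gate = createHandoffGate({ loginUrl: 'https://example.com/login' });
gate.onNavigated('https://example.com/login', 0); // initial load of the login page
const afterInitialLoad = gate.shouldResolve(5000); // false: still on the login route
gate.onNavigated('https://example.com/home', 6000);
const justNavigated = gate.shouldResolve(6100); // false: not settled yet
const settled = gate.shouldResolve(8000); // true: off login and stable
console.log(afterInitialLoad, justNavigated, settled);
```

Timestamps are passed in rather than read from the clock so the gate can be exercised deterministically in a test.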
… navigate/run_js error + the doctrine “no browsable source” path: proposal.

Generated for review. All findings are grounded in src/main/agent-runtime.mjs, blitzos-agents.md, connection-ops.mjs, and os-tools.mjs. The Agent 5 fixes are committed and field-verified; the Cause A/B items are proposals pending a go-ahead.