ENVO HQ
Command Center
Thursday, April 2
0
Projects
0
Agents
βœ—
Gateway
0%
Live
10:13 PM
ENVO HQ
← Docs
hq/agents/agt_hq/2026-02-26/ship-ready-run_6f9a966b

Ship Ready Run 6f9a966b

Updated: 2/26/2026, 3:42:37 PM

Ops/Admin hygiene sweep (SSOT alignment)

Problem

run_queue.json can accumulate items stuck in status="running" (e.g., child session lost, crash, dispatcher restart). This blocks visibility and makes β€œwhat’s actually running?” untrustworthy.

Statuses (keep existing; add one new optional)

  • queued β†’ eligible to dispatch
  • running β†’ actively executing (must have run.startedAtMs)
  • done / failed / canceled β†’ terminal
  • (optional) stale β†’ non-terminal but not executing; requires manual triage

Stale definition

An item is stale-running if:

  • status=="running" AND run.startedAtMs exists AND
  • nowMs - run.startedAtMs > staleAfterMs (default 6h for short runs, 24h for long) AND
  • no recent heartbeat/update recorded (see run.lastHeartbeatAtMs below)

Suggested fields (backwards-compatible)

Add under item.run (optional fields):

{
  "lastHeartbeatAtMs": 0,
  "staleAfterMs": 21600000,
  "attempt": 1,
  "dispatcherRunId": "disp_...",
  "notes": "optional runtime notes"
}

Add at item top-level (optional):

{
  "triage": {
    "status": "needs_review",
    "reason": "stale_running",
    "flaggedAtMs": 0
  }
}

Safe repair procedure (NO deletes)

  1. Snapshot backup: copy run_queue.json to run_queue.backup.<timestamp>.json.
  2. For each status=running item older than threshold:
    • If run.childSessionKey exists and you can confirm it’s still active β†’ keep running, update run.lastHeartbeatAtMs.
    • Else mark as failed (preferred) or stale (if you want explicit triage):
      • status: "failed"
      • run.finishedAtMs: nowMs
      • run.error: "Marked failed by dispatcher: stale running (no active child session / exceeded threshold)."
      • triage: {status:"needs_review", reason:"stale_running", flaggedAtMs: nowMs}
  3. Optionally requeue by creating a new item with same title/notes/taskIds and status:"queued" (do not mutate history).

Dispatcher behavior (minimal code/data policy)

  • Dispatcher only dispatches queued items (keep current rule).
  • On each cycle, it can log only stale-running items, OR (safer) auto-mark them failed after threshold.
  • Never overwrite an existing terminal status.

Notes on your current data

Your queue already contains multiple running items with very old startedAtMs (2026-02-26 and earlier). The above policy lets you cleanly terminate them without deleting history, and optionally requeue fresh runs.


RECEIPT: runId: run_6f9a966b artifact: /Users/ENVOAI/.openclaw/workspace-theo/second-brain/brain/hq/agents/agt_hq/2026-02-26/ship-ready-run_6f9a966b.md

Files are read from second-brain/brain/ on your machine.