HQ Safety + Capabilities Plan: Per-Agent Tool Enforcement
Run ID: run_548665587a96
Agent: agt_shipwright (Forge)
Date: 2026-02-26
Status: Proposal - SHIP READY
Executive Summary
The HQ system already has foundational capability infrastructure in place (riskTier, allowedTools), but lacks runtime enforcement. This plan proposes minimal code changes to activate robust per-agent capabilities without breaking existing functionality.
Current State Assessment
What's Already Built
- Agent Schema: riskTier (SAFE/BUILDER/OPERATOR) + allowedTools[] in agents.json
- UI Management: AgentsBoard.tsx has full CRUD interface for capabilities
- API Support: /api/agents and /api/gmc/agents/upsert handle capability updates
- Dispatch Infrastructure: dispatch_config.json routes agents to OpenClaw instances
Missing: Runtime Enforcement
- No validation of tool calls against allowedTools at execution time
- No enforcement of capability restrictions during agent runs
- UI comment confirms: "Enforcement will be applied by the dispatcher (next step)"
Proposed Solution: 3-Layer Defense
Layer 1: Dispatch-Time Tool Filtering
File: src/app/api/ops/enqueue-run/route.ts
// Add before agent execution
function enforceAgentCapabilities(agentId: string, requestedTools: string[]): string[] {
  const agents = readAgentsConfig();
  const agent = agents.agents?.find(a => a.id === agentId);
  if (!agent) return []; // Fail-safe: no tools if agent not found

  const allowedTools: string[] = agent.allowedTools || [];
  const riskTier: string = agent.riskTier || 'SAFE';

  // Risk tier baseline restrictions: tools a tier may never use, even if allowlisted
  const tierBlacklist: Record<string, string[]> = {
    SAFE: ['exec', 'browser', 'github', 'clawhub'],
    BUILDER: ['github', 'clawhub'],
    OPERATOR: [], // No tier-level blocks; the allowlist still applies
  };

  // Unrecognized tier values fall back to the SAFE restrictions
  const blocked = tierBlacklist[riskTier] || tierBlacklist['SAFE'];

  // Keep only tools that are both allowlisted and not blocked for the tier
  return requestedTools.filter(tool =>
    allowedTools.includes(tool) && !blocked.includes(tool)
  );
}
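To make the filtering rule concrete, here is a minimal self-contained sketch of the same logic run against an in-memory config. The sample agents and their allowlists are illustrative assumptions (agt_builder is a made-up ID), and filterTools is a hypothetical stand-in for enforceAgentCapabilities that takes the agent list as an argument instead of calling readAgentsConfig().

```typescript
type RiskTier = 'SAFE' | 'BUILDER' | 'OPERATOR';

interface AgentEntry {
  id: string;
  riskTier?: RiskTier;
  allowedTools?: string[];
}

// Same tier baseline as the Layer 1 proposal.
const TIER_BLOCKED: Record<RiskTier, string[]> = {
  SAFE: ['exec', 'browser', 'github', 'clawhub'],
  BUILDER: ['github', 'clawhub'],
  OPERATOR: [],
};

// Hypothetical stand-in for enforceAgentCapabilities (config passed in, not read from disk).
function filterTools(agents: AgentEntry[], agentId: string, requested: string[]): string[] {
  const agent = agents.find(a => a.id === agentId);
  if (!agent) return []; // unknown agent: no tools
  const allowed = agent.allowedTools ?? [];
  const blocked = TIER_BLOCKED[agent.riskTier ?? 'SAFE'];
  // A tool survives only if it is allowlisted AND not blocked for the tier.
  return requested.filter(t => allowed.includes(t) && !blocked.includes(t));
}

// Illustrative sample data, not taken from a real agents.json.
const sampleAgents: AgentEntry[] = [
  { id: 'agt_research', riskTier: 'SAFE', allowedTools: ['web_fetch', 'exec'] },
  { id: 'agt_builder', riskTier: 'BUILDER', allowedTools: ['exec', 'github'] },
];

// exec is allowlisted but blocked at SAFE, so only web_fetch remains.
const researchTools = filterTools(sampleAgents, 'agt_research', ['web_fetch', 'exec']);
// github is blocked at BUILDER and browser is not allowlisted, so only exec remains.
const builderTools = filterTools(sampleAgents, 'agt_builder', ['exec', 'github', 'browser']);
// Agents missing from the config get an empty tool list.
const unknownTools = filterTools(sampleAgents, 'agt_missing', ['web_fetch']);

console.log(researchTools, builderTools, unknownTools);
```

Note that the tier restriction wins over the allowlist: listing exec for a SAFE agent has no effect until the tier is raised.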
Layer 2: OpenClaw Agent Config Integration
File: src/app/api/ops/dispatch-config/route.ts
// Extend agent model updates to include tool restrictions.
// Note: enforceAgentCapabilities (Layer 1) must be importable from this route,
// e.g. by moving it into a shared module.
if (agentId && (model || toolRestrictions)) {
  cfg.agentMap = cfg.agentMap || {};
  const entry = cfg.agentMap[agentId] || {};
  if (model) entry.model = model;
  // Store only the tools that survive the allowlist and tier checks
  if (toolRestrictions) entry.allowedTools = enforceAgentCapabilities(agentId, toolRestrictions);
  cfg.agentMap[agentId] = entry;
  // Write to OpenClaw config as well for runtime enforcement
  updateOpenClawAgentConfig(agentId, entry);
}
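The update path can be exercised without touching the filesystem by treating it as a pure function over the config object. The sketch below is one possible shape, not the existing route code: the DispatchConfig, AgentUpdate, and ToolFilter types and the applyAgentUpdate name are all assumptions, and the sync to OpenClaw is left out because the interface of updateOpenClawAgentConfig is not specified in this plan.

```typescript
interface AgentMapEntry {
  model?: string;
  allowedTools?: string[];
}

interface DispatchConfig {
  agentMap?: Record<string, AgentMapEntry>;
}

interface AgentUpdate {
  model?: string;
  toolRestrictions?: string[];
}

// Injected so the real capability filter (or a test stub) can be swapped in.
type ToolFilter = (agentId: string, requested: string[]) => string[];

// Hypothetical pure version of the dispatch-config update: returns a new config
// and leaves the input untouched.
function applyAgentUpdate(
  cfg: DispatchConfig,
  agentId: string,
  update: AgentUpdate,
  filter: ToolFilter,
): DispatchConfig {
  const { model, toolRestrictions } = update;
  if (!agentId || (!model && !toolRestrictions)) return cfg;

  const agentMap = { ...(cfg.agentMap ?? {}) };
  const entry: AgentMapEntry = { ...(agentMap[agentId] ?? {}) };
  if (model) entry.model = model;
  // Persist the filtered list, never the raw request.
  if (toolRestrictions) entry.allowedTools = filter(agentId, toolRestrictions);
  agentMap[agentId] = entry;
  return { ...cfg, agentMap };
}

// Stub filter for illustration only: drops github for every agent.
const dropGithub: ToolFilter = (_agentId, tools) => tools.filter(t => t !== 'github');

const before: DispatchConfig = { agentMap: { agt_research: { model: 'model-a' } } };
const after = applyAgentUpdate(
  before,
  'agt_research',
  { toolRestrictions: ['web_fetch', 'github'] },
  dropGithub,
);

console.log(after.agentMap?.agt_research);
```

Keeping the filter as a parameter means the same function can be unit tested with a stub and wired to the Layer 1 filter in the route handler.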
Layer 3: UI Safety Indicators
File: src/components/AgentsBoard.tsx
// Add capability validation warnings
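const capabilityRisk = useMemo(() => {
  // Mirror the Layer 1 tier restrictions so the warning matches what dispatch will block
  const tierBlocked: Record<string, string[]> = {
    SAFE: ['exec', 'browser', 'github', 'clawhub'],
    BUILDER: ['github', 'clawhub'],
    OPERATOR: [],
  };
  const blocked = tierBlocked[riskTier] || tierBlocked.SAFE;
  const conflicting = allowedTools.filter(t => blocked.includes(t));
  if (conflicting.length > 0) {
    return `RISK: ${conflicting.join(', ')} blocked at ${riskTier} tier and will be filtered at dispatch`;
  }
  return null;
}, [allowedTools, riskTier]);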
// Display warning in UI near save button
{capabilityRisk && (
  <div className="text-xs text-yellow-400 mt-1">⚠️ {capabilityRisk}</div>
)}
Implementation Roadmap
Phase 1: Foundation (1-2 hours)
- Add validation helper functions to existing agent APIs
- Update dispatch-config to include tool filtering
- Test capability enforcement with sample agent configurations
Phase 2: Integration (2-3 hours)
- Connect HQ→OpenClaw agent config sync
- Add runtime enforcement in enqueue-run pipeline
- Validate tool filtering works end-to-end
Phase 3: UI Polish (1 hour)
- Add capability warnings in AgentsBoard
- Improve tier descriptions with specific tool examples
- Test edge cases (invalid configs, missing data)
Risk Mitigation
Backward Compatibility
- All changes are additive - no existing schemas modified
- Default fallbacks ensure agents without explicit capabilities get SAFE defaults
- Gradual rollout - can enable per-agent without affecting others
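One way to realize the SAFE-by-default fallback is to normalize every agent record before it reaches the filter. The sketch below assumes records arrive as loosely typed JSON; normalizeCapabilities is a hypothetical helper name, not an existing HQ function.

```typescript
type RiskTier = 'SAFE' | 'BUILDER' | 'OPERATOR';

const TIERS: readonly string[] = ['SAFE', 'BUILDER', 'OPERATOR'];

interface Capabilities {
  riskTier: RiskTier;
  allowedTools: string[];
}

function isRiskTier(value: unknown): value is RiskTier {
  return typeof value === 'string' && TIERS.includes(value);
}

// Hypothetical helper: coerce an untrusted agent record into safe capability values.
function normalizeCapabilities(raw: unknown): Capabilities {
  const record = (typeof raw === 'object' && raw !== null ? raw : {}) as Record<string, unknown>;

  // Missing or unrecognized tiers fall back to SAFE.
  const tier = record['riskTier'];
  const riskTier: RiskTier = isRiskTier(tier) ? tier : 'SAFE';

  // Anything that is not an array of strings becomes an empty allowlist;
  // non-string entries inside an array are dropped.
  const tools = record['allowedTools'];
  const allowedTools: string[] = Array.isArray(tools)
    ? tools.filter((t): t is string => typeof t === 'string')
    : [];

  return { riskTier, allowedTools };
}

const legacyAgent = normalizeCapabilities({ id: 'agt_legacy' });
const badConfig = normalizeCapabilities({ riskTier: 'ADMIN', allowedTools: 'exec' });
const mixedTools = normalizeCapabilities({ riskTier: 'BUILDER', allowedTools: ['exec', 42] });

console.log(legacyAgent, badConfig, mixedTools);
```

With this in place, an agent record that predates the capability fields resolves to the SAFE tier with no tools, so whether existing agents keep their current tools depends on populating their allowlists before enforcement is switched on.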
Safety Measures
- Fail-safe defaults: Unknown agents → SAFE tier, empty allowlist
- HQ override: agt_hq always gets OPERATOR tier regardless of config
- Rollback plan: Capability filtering can be disabled via feature flag
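The HQ override and the rollback flag could wrap the tier filter roughly as follows. This is a sketch under assumptions: the resolveTier and guardTools names and the enforcementEnabled option are hypothetical, and where the flag is stored (environment variable, config file) is left open.

```typescript
type RiskTier = 'SAFE' | 'BUILDER' | 'OPERATOR';

const HQ_AGENT_ID = 'agt_hq';

// agt_hq is always treated as OPERATOR; other agents use their configured tier,
// or SAFE when none is set.
function resolveTier(agentId: string, configured?: RiskTier): RiskTier {
  if (agentId === HQ_AGENT_ID) return 'OPERATOR';
  return configured ?? 'SAFE';
}

interface GuardOptions {
  // Hypothetical feature flag: false restores the pre-enforcement behavior.
  enforcementEnabled: boolean;
  // Tier-aware filter, injected so this wrapper stays independent of config loading.
  filter: (tier: RiskTier, requested: string[]) => string[];
}

function guardTools(
  agentId: string,
  configured: RiskTier | undefined,
  requested: string[],
  opts: GuardOptions,
): string[] {
  if (!opts.enforcementEnabled) return requested; // rollback path: no filtering
  return opts.filter(resolveTier(agentId, configured), requested);
}

// Stub filter for illustration only: SAFE agents lose exec, other tiers keep everything.
const safeBlocksExec = (tier: RiskTier, tools: string[]): string[] =>
  tier === 'SAFE' ? tools.filter(t => t !== 'exec') : tools;

const enforced: GuardOptions = { enforcementEnabled: true, filter: safeBlocksExec };
const disabled: GuardOptions = { enforcementEnabled: false, filter: safeBlocksExec };

// Even if agents.json marks agt_hq as SAFE, it is filtered as OPERATOR.
const hqTools = guardTools('agt_hq', 'SAFE', ['exec'], enforced);
// An agent with no configured tier is filtered as SAFE.
const researchTools = guardTools('agt_research', undefined, ['exec', 'web_fetch'], enforced);
// With the flag off, the request passes through untouched.
const rollbackTools = guardTools('agt_research', undefined, ['exec'], disabled);

console.log(hqTools, researchTools, rollbackTools);
```

Placing the flag check in a single wrapper keeps rollback to one code path: turning it off bypasses filtering entirely rather than relaxing individual rules.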
Testing Strategy
- Unit tests for capability filtering logic
- Integration tests with sample agent configurations
- Manual verification via AgentsBoard before production use
Expected Benefits
- True Defense in Depth: Runtime enforcement prevents capability escalation
- Granular Control: Per-tool, per-agent restrictions
- Audit Trail: All capability changes logged in ops feed
- Zero Breakage: Existing agents continue working unchanged
- Gradual Adoption: Can enable strict enforcement agent-by-agent
BLOCKED
None - All required tools (exec, web_fetch) are available for this implementation.
Next Actions
- Review & approve this implementation plan
- Assign Phase 1 development work (dispatch-time filtering)
- Test capability enforcement with non-critical agent (e.g., agt_research)
- Gradually enable for frontline agents after validation
SHIP-READY: This plan provides a clear, safe path to activate HQ's existing capability infrastructure with minimal code changes and maximum backward compatibility.