TL;DR: HermesAgent has a built-in delegate_task tool. I found a problem with it, and built process-isolated sub-agents that actually retain what they learn.
The Problem with Hermes’s Built-in Multi-Agent
HermesAgent ships with delegate_task — it spins up sub-agents in-process, fast and simple. But look at the source code:
DELEGATE_BLOCKED_TOOLS = frozenset({"delegate_task", "clarify", "memory", ...})
child = AIAgent(..., skip_memory=True, ...)
Every insight a sub-agent develops dies when the thread exits. The swarm does work, but never gets smarter.
That’s the fundamental problem. Sub-agents are disposable compute, not collaborative intelligence. I wanted something different.
What I Built Instead
Each sub-agent is a complete Hermes instance — own OS process, own config, own state, full memory access.
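A minimal sketch of what "own OS process, own config, own state" could look like, assuming a hypothetical layout of one config.yaml plus a state directory per instance. The function names and file names here are illustrative and are not taken from Hermes or from the repository; the child process is a stand-in for launching a real agent.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def spawn_instance(main_home: Path, instances_root: Path, agent_id: str) -> Path:
    """Snapshot the main agent's config into a fresh, isolated instance directory."""
    instance_home = instances_root / agent_id
    # Refuse to reuse an existing instance so two agents never share state.
    instance_home.mkdir(parents=True, exist_ok=False)
    # Hypothetical layout: a single config file plus a private state directory.
    shutil.copy2(main_home / "config.yaml", instance_home / "config.yaml")
    (instance_home / "state").mkdir()
    return instance_home


def run_in_own_process(instance_home: Path, command: list) -> subprocess.CompletedProcess:
    """Run the sub-agent as a separate OS process rooted in its own directory."""
    return subprocess.run(
        command, cwd=instance_home, capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        main_home = Path(tmp) / "main"
        main_home.mkdir()
        (main_home / "config.yaml").write_text("model: example\n")
        home = spawn_instance(main_home, Path(tmp) / "instances", "agent-001")
        # Stand-in for the real agent: a child Python process that reports its cwd.
        result = run_in_own_process(
            home, [sys.executable, "-c", "import os; print(os.getcwd())"]
        )
        print(result.stdout.strip())
```

Copying the config rather than pointing at it is what makes the instance a snapshot: later changes to the main agent do not leak into a running sub-agent.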

The Lifecycle
Spawn → Execute → Handoff → Complete → Merge Learnings → Cleanup
- Spawn: spawn-agent.sh snapshots the main agent’s config into an isolated instance
- Execute: The sub-agent runs with full autonomy: no restricted tools, real memory
- Handoff: Sub-agent writes a structured handoff with findings, memory updates, and skill recommendations
- Complete: complete-agent.sh validates the handoff, sends results via message queue, deletes the instance directory immediately
- Merge: The main agent absorbs learnings through the native memory pipeline
Instances are ephemeral. Learnings are permanent.
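The Handoff, Complete, and Merge steps can be sketched roughly as follows. This is an illustration under assumptions, not the repository's implementation: the handoff.json file name, the three key names, and the plain list standing in for the main agent's memory are all hypothetical, and the message queue is left out.

```python
import json
import shutil
import tempfile
from pathlib import Path

# Hypothetical handoff schema, based on the three items named in the lifecycle.
REQUIRED_KEYS = ("findings", "memory_updates", "skill_recommendations")


def load_handoff(instance_home: Path) -> dict:
    """Read and validate handoff.json; raise ValueError if a required key is missing."""
    handoff = json.loads((instance_home / "handoff.json").read_text())
    missing = [key for key in REQUIRED_KEYS if key not in handoff]
    if missing:
        raise ValueError(f"handoff missing keys: {missing}")
    return handoff


def complete_instance(instance_home: Path, main_memory: list) -> dict:
    """Validate the handoff, merge its memory updates, then delete the instance."""
    handoff = load_handoff(instance_home)
    for note in handoff["memory_updates"]:
        if note not in main_memory:  # skip learnings the main agent already has
            main_memory.append(note)
    # The instance is ephemeral; only the merged learnings survive.
    shutil.rmtree(instance_home)
    return handoff


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp) / "agent-001"
        home.mkdir()
        (home / "handoff.json").write_text(json.dumps({
            "findings": ["build fails on missing env var"],
            "memory_updates": ["export the env var before running the build"],
            "skill_recommendations": [],
        }))
        memory = []
        complete_instance(home, memory)
        print(memory, home.exists())
```

Validating before deleting matters: if the handoff is malformed, the instance directory is still there to inspect.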
Mistakes I Made Along the Way
Zombie agents in the registry. Strict bash mode + missing handoff file = the cleanup script exits early, leaving dead entries behind. Fixed with graceful degradation — always clean up the registry, even on failure.
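The original fix lives in a bash script; the same idea, expressed as a Python sketch, is to put the registry cleanup in a finally block so it runs whether or not the handoff exists. The registry format (a JSON object keyed by agent id) and the function names are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def unregister(registry_path: Path, agent_id: str) -> None:
    """Remove one agent from a JSON registry file (hypothetical format: id -> metadata)."""
    registry = json.loads(registry_path.read_text())
    registry.pop(agent_id, None)  # no error if the entry is already gone
    registry_path.write_text(json.dumps(registry))


def complete_agent(registry_path: Path, instance_home: Path, agent_id: str) -> bool:
    """Return True if a handoff was found; always drop the registry entry."""
    try:
        handoff_path = instance_home / "handoff.json"
        if not handoff_path.exists():
            return False  # degrade gracefully instead of aborting
        json.loads(handoff_path.read_text())  # raises if the handoff is malformed
        return True
    finally:
        # Runs on success, on early return, and on exceptions alike,
        # so a failed agent never leaves a zombie entry behind.
        unregister(registry_path, agent_id)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        registry_path = Path(tmp) / "registry.json"
        registry_path.write_text(json.dumps({"agent-001": {"status": "running"}}))
        instance_home = Path(tmp) / "agent-001"
        instance_home.mkdir()  # no handoff.json written: simulates a failed agent
        found = complete_agent(registry_path, instance_home, "agent-001")
        print(found, json.loads(registry_path.read_text()))
```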
Agent ignored my sub-agent skill. Given a choice between native delegate_task and my shell script approach, the LLM picked the simpler option every time. The model naturally gravitates to the path of least resistance. Fixed by adding a Decision Guide explaining when each approach is appropriate — now the agent knows when to use the lightweight in-process delegate vs. when to spin up a full isolated instance.
Wrong API keys. The spawn script was pulling from the global Hermes install instead of the project-local agent. Fixed to fork from the running instance so the sub-agent inherits the correct context.
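One way to express "fork from the running instance" is to resolve the source directory from the current process's environment first and fall back to the global install only when nothing is set. The AGENT_HOME variable name below is hypothetical, chosen for this sketch; it is not a documented Hermes setting.

```python
import os
from pathlib import Path


def resolve_source_home(env: dict, global_home: Path) -> Path:
    """Prefer the running instance's home directory over the global install."""
    running_home = env.get("AGENT_HOME")  # hypothetical variable set by the running agent
    if running_home:
        return Path(running_home)
    return global_home  # fallback: only used when no running instance is identified


if __name__ == "__main__":
    # Pass the real environment in; the global path here is just an example value.
    print(resolve_source_home(dict(os.environ), Path.home() / ".agent"))
```

Taking the environment as a parameter instead of reading os.environ inside the function keeps the lookup easy to test with both cases.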
Why This Matters
The core insight: learning shouldn’t be scoped to a thread lifetime.
If you’re building a multi-agent system and your sub-agents can’t retain what they discover, you’re running an expensive stateless compute cluster, not a system that gets smarter over time.
Process isolation costs more than in-process threads. But it buys you:
- Real memory that persists across the agent’s lifetime
- No cross-contamination between concurrent agents
- Clean handoff artifacts you can inspect and audit
- Agents that actually accumulate knowledge
All experiments were done with Qoder’s expert mode, which I highly recommend for long-running agentic tasks where you want the agent to make mistakes, learn, and fix them autonomously.
GitHub
Full implementation: github.com/Czhang0727/agent-from-scratch
Next: how skill management keeps the main agent sane as the number of skills grows.
Auth_Verified: 2026.05.15
