TL;DR: Starting from first principles, an agent is just a workflow that thinks like a human. Here’s the 10,000ft view before we build it.
I’m starting to build a general agent framework from scratch, sharing what I’ve learned over the past few years. Let’s start from the very beginning.
What Is an Agent?
IMO, an agent is a workflow that can think like a human and do what a human can do. The concept predates LLMs: backend system design already had stateful agents.
The only reason “agents” are popular now is Large Models. We finally found a moment when agent design could be generalized — not hand-crafted for each narrow task.
The 10,000ft View: An Agent Is a PC
Back to old-fashioned computing: we have IO, a CPU, and storage.
An agent maps almost perfectly:
- CPU → LLM
- IO → connector to external devices (tools, APIs, sensors)
- Storage → memory
Yep, it’s that simple.
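To make the mapping concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Agent` class, the `fake_llm` stand-in, and the `tool_name:argument` convention are hypothetical, not part of any real framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical container mirroring the CPU / IO / storage analogy."""

    # "CPU": any callable that turns a prompt into text. A real LLM client would go here.
    llm: Callable[[str], str]
    # "IO": named tools the agent can call to reach the outside world.
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    # "Storage": a plain list standing in for memory.
    memory: List[str] = field(default_factory=list)

    def step(self, user_input: str) -> str:
        # Storage -> CPU: recalled memory becomes part of the prompt.
        prompt = "\n".join(self.memory + [user_input])
        decision = self.llm(prompt)
        # CPU -> IO: if the "model" answers "tool_name:argument", call that tool.
        name, _, arg = decision.partition(":")
        result = self.tools[name](arg) if name in self.tools else decision
        # IO -> Storage: remember what happened.
        self.memory.append(f"{user_input} -> {result}")
        return result


def fake_llm(prompt: str) -> str:
    # Stand-in for a model: always asks the "upper" tool to process the last line.
    return "upper:" + prompt.splitlines()[-1]


agent = Agent(llm=fake_llm, tools={"upper": str.upper})
```

Calling `agent.step("hello")` sends the input through the fake model, runs the `upper` tool, and stores the exchange in `agent.memory`. Swapping any one of the three fields leaves the other two untouched, which is the point of the analogy.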

Over time, engineers added fancy stuff to make each component faster:
- Better CPU → better models
- Larger bandwidth → larger context windows
- More applications → more skills / MCP servers
Nothing fundamentally changed.
The Agent Heartbeat
Here’s the pseudocode for agent orchestration. If you know how OpenClaw works, this is pretty much the heartbeat:

    while True:
        sleep(1)  # poll interval, in seconds
        user_input = read_input(context)
        intent_and_plan = think(context, user_input)
        execution_result = do(context, intent_and_plan)
        # this phase can sometimes be async
        evaluation(context, execution_result)

Simple loop: read, think, do, evaluate. Repeat.
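For readers who want to run the loop, here is a minimal sketch with stub implementations. The bodies of `read_input`, `think`, `do`, and `evaluate`, the dict-based `context`, and the `max_ticks` bound are all assumptions made so the example terminates. A real agent would call a model and real tools instead.

```python
import time
from collections import deque


def read_input(context):
    # Hypothetical input source: pop the next pending message, if any.
    return context["inbox"].popleft() if context["inbox"] else None


def think(context, user_input):
    # Stand-in for the LLM call: the "plan" is just a labelled copy of the input.
    return {"action": "echo", "payload": user_input}


def do(context, plan):
    # Stand-in for tool execution.
    return plan["payload"].upper()


def evaluate(context, result):
    # Record the outcome so later iterations can see it.
    context["history"].append(result)


def run(context, max_ticks, interval_s=0.0):
    # Bounded version of `while True` so the sketch terminates.
    for _ in range(max_ticks):
        time.sleep(interval_s)
        user_input = read_input(context)
        if user_input is None:
            continue  # nothing to do on this tick
        plan = think(context, user_input)
        result = do(context, plan)
        evaluate(context, result)
    return context["history"]


context = {"inbox": deque(["ping", "status"]), "history": []}
history = run(context, max_ticks=3)
```

The third tick finds an empty inbox and does nothing. With a real `interval_s`, that idle tick is exactly the wasted wake-up that a polling loop pays for.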
The Event-Driven Upgrade
There’s a known problem with the sleep-based loop: it wastes resources polling while nothing is happening. The solution? Go event-driven, just like JavaScript.
Claude Code’s internals indicate they’re doing the same thing. So the loop evolves:
User interaction side:

    pub_sub_client = PubSubClient()
    user_input = read_user_input()
    pub_sub_client.send(topic="user_input", message=user_input)
    result = pub_sub_client.subscribe(topic="task_result")

Consumer (agent) side:

    user_input = pub_sub_client.subscribe(topic="user_input")
    intent_and_plan = think(context, user_input)
    execution_result = do(context, intent_and_plan)
    pub_sub_client.send(topic="task_result", message=execution_result)
    # this phase can sometimes be async
    evaluation(context, execution_result)

Clean decoupling. The agent becomes a proper event consumer.
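The same idea can be tried in a single process with Python’s standard library. This is a sketch under stated assumptions: two `queue.Queue` objects stand in for the `user_input` and `task_result` topics, and `think`, `do`, `ask`, and the `STOP` sentinel are hypothetical names. A real deployment would use an actual message broker.

```python
import queue
import threading

# Two in-process queues stand in for the "user_input" and "task_result" topics.
user_input_topic = queue.Queue()
task_result_topic = queue.Queue()
STOP = object()  # sentinel that tells the consumer to shut down


def think(user_input):
    # Stand-in for the LLM call.
    return {"action": "reverse", "payload": user_input}


def do(plan):
    # Stand-in for tool execution.
    return plan["payload"][::-1]


def agent_consumer():
    while True:
        # get() blocks until a message arrives, so no sleep-based polling is needed.
        user_input = user_input_topic.get()
        if user_input is STOP:
            break
        task_result_topic.put(do(think(user_input)))


def ask(user_input, timeout_s=2.0):
    # User-interaction side: publish the input, then wait for the result.
    user_input_topic.put(user_input)
    return task_result_topic.get(timeout=timeout_s)


worker = threading.Thread(target=agent_consumer, daemon=True)
worker.start()
reply = ask("abc")
user_input_topic.put(STOP)
worker.join(timeout=2.0)
```

The consumer thread sleeps inside `get()` until there is work, so it burns no cycles while idle, and the producer never needs to know how the agent is implemented.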
What’s Coming
In this series, I’ll dig deeper into each component:
- IO — how the agent talks to the world
- Orchestration — prompt engineering and the harness system
- Skills — user manuals for tools
- Memory — expanding the context window
- Multi-agent — when one agent isn’t enough
Track progress and raise issues / PRs at github.com/Czhang0727/agent-from-scratch.
Auth_Verified: 2026.04.01
