Introduction
Despite the Hype about OpenClaw and its trendy reach about how fascinating it is doing such amazing things AUTONOMOUSLY, we usually get Thoughts about how they achieve, what System Design they have like that
And since it is Open Source it’s easy for us to do Research 💪
Let us Dive In!
🦞 OpenClaw
In previous post we saw what OpenClaw is? How to setup **SECURELY**.
In One Line: OpenClaw is a very good Agentic System Design
Here let us explore the Architecture behind OpenClaw or System Design of OpenClaw
🏗️ OpenClaw Architecture
Overall: It gets User Input from Telegram, Discord etc., and talks to OpenAI, Anthropic, etc., and Executes Tools and Replies back to User
Channel Adapters
We can use/connect OpenClaw from various sources right? From Telegram, Discord etc., which have various formats for each
-
Normalization: We need Channel Adapters to Normalize those messages into unified format
-
Attachment Extraction: Of various media source like Documents, Voice Message into consistent format to process
Here AI is not involved MUCH, just very good System Design
🚪 Gateway: The Core Architecture
It is the core Architecture like it’s the WebSocket + HTTP Server that runs on the Machine
It’s like managing traffic, accepting Connection from Clients like Telegram, Discord and routes them to specific “Sessions”
Session Router
Determines which session should handle a current incoming message.
Like DM might go into main session and group chats go into separate session.
Lane based Command Queue
Very important concept: Smart Choice!
Every Session will have their own Lane and in that Lane too, each Message is Queued
For example if the User sends 4 Messages while Agent is Busy, then those Messages are Queued to prevent Race Condition or Hallucinated continuity making the OpenClaw Deterministic and Reliable
⚡ Input Types: The Aliveness
OpenClaw is Proactive due to some factors we see here. Where we don’t initiate chat (Reactive) but OpenClaw Initiates.
Some of those which cause OpenClaw to (wake up and) do Tasks without Human presence
Messages
The Standard message that is interacted via text as usual text like in Telegram, Discord etc.,
Heartbeats
It is the ‘Timer’ like we can set up to trigger in interval, where in HEARTBEAT.md we provide the Instruction and Configure that in ~/.openclaw/openclaw.json like,
....
"heartbeat": {
"every": "3h",
"activeHours": {
"start": "08:00",
"end": "22:00",
"timezone": "Asia/Kolkata"
}
}
....
Crons
Events that are Scheduled for Specific Times (Intervals)
Webhooks
External triggers like for example, if a GitHub issue is opened then Webhook is triggered and notifies OpenClaw to Act on
Hooks
Internal State changes like System Booting or Agent finish a Task etc.,
🤖 The Agent System: AI Execution Runtime
Here is where the actual AI comes to do Action
- Model Choosing: It will select the LLM to use, which we provide during the
Onboardingphase at first - System Prompt Loading: Assembles Prompts from Skills, Tools
- Context Window Guard: It makes sure to effectively use the Context Window like if the Context Window is full it will summarize the Context (after the Memory Flush Operation) or Fails safely
It also uses AGENTS.md, BOOTSTRAP.md, HEARTBEAT.md, IDENTITY.md, SOUL.md, TOOLS.md, USER.md under .openclaw/workspace in Prompt
Also Agent reads Session History stored as JSON Lines (JSONL) File in .openclaw/agents/<agent_id>/sessions Path
🛠️ Tools and Nodes
OpenClaw is popular because it has the “Deep Access” to the Machine that it can perform any Action on the Machine to Complete the Task
Tools
- Browser: It includes a dedicated, Chromium Instance controlled via Chrome DevTools Protocol that allows the OpenClaw to Browse the Web, Take Snapshot, and related Operations
- System Tools: OpenClaw can execute Shell Commands, Manage Files and do such related Operations
Nodes
The External Device that the Architecture supports external devices or Physical Hardware to control like to use Camera, Microphone, etc.,
🧠 Memory Features
Memory in OpenClaw is designed to be radically simple, it mostly relies on FILES!
Everything is in AGENTS.md, BOOTSTRAP.md, HEARTBEAT.md, IDENTITY.md, SOUL.md, TOOLS.md, USER.md, memory/YYYY-MM-DD-ID.md (many files) in ~/.openclaw/workspace Path and also Session Files
Storage: OpenClaw’s Brain
Its Brain is in the File System
-
Local Persistence: All the states like User Preferences, Conversation History and Context is stored as local Markdown and JSONL (JSON Lines) file Format on the Machine
-
Transparency: Since Memory is the Markdown File, we can open the File and see exactly what the Agent knows.
They say: “Files beat abstractions” and “Explainability beats cleverness”
Retrieval Process
It reconstructs its Memory every time it wakes up
- State Reconstruction: When an Event (like Message or Heartbeat or any other) Triggers the Agent, the Agent Runner reads the specific session file in the Context
- Static Memory: Files like
AGENT.md,SOUL.mdetc., are loaded - Dynamic Memory: The Conversation History is read from the Session Logs
Of all those Context Window is managed accordingly like “Compacted” just like said before
📤 Response
Either in Web we can see the Response from LLM which is Streamed
Or
Channel Adapter that Transforms the Unified Response Format back to Original Sender’s Platform Specific Message
Thanks for reading!