HanyanOS Deployment Notes
1. Overview
Modern AI workloads demand more than a single monolithic chat model. In HanyanOS, we run 7 specialized OpenClaw agents on an Intel N100 mini-server, each with distinct roles, model profiles, and operational constraints. The orchestrator agent—Serena (柳含烟)—holds supreme command and routes, reviews, and audits every task across the agent mesh.
This post dissects the architecture: agent topology, delegation workflow, model assignment strategies, subagent permission gates, and the hard-won lessons from running this multi-agent assembly in production on a 4-core/6W Intel N100.
2. Architecture Overview
Key design principle: All external input enters through Serena. Sub-agents never communicate with external channels directly. This creates a single point of control for security auditing, cost tracking, and quality review.
3. Agent Definitions and Model Assignments
Each agent has a unique systemPromptOverride that defines its persona, workflow rules, and constraints. Here is the agent roster:
| Agent ID | Persona | Primary Model | Fallback Models | Subagent Access |
|---|---|---|---|---|
| main | 柳含烟 (Serena) | deepseek-v4-pro | v4-flash, gpt-4o-mini | coder, devops, tester, security, uxui, accountant |
| coder | 沈漫歌 | deepseek-v4-pro | v4-flash, gpt-4o-mini | tester only |
| devops | 苏清雅 | deepseek-v4-pro | v4-flash, doubao-flash | security, tester |
| tester | 钟离燕 | deepseek-v4-pro | v4-flash, gpt-4o-mini | none |
| security | 洛飞烟 | deepseek-v4-pro | v4-flash | none |
| uxui | 楚灵儿 | deepseek-v4-pro | v4-flash, gpt-4o-mini | tester only |
| accountant | 江惜月 | deepseek-v4-flash | gpt-4o-mini | none |
Model routing is configured via the model.primary and model.fallbacks fields. The accountant agent uses deepseek-v4-flash ($0.14 per million input tokens) instead of deepseek-v4-pro ($1.74 per million input tokens), roughly a 12x price difference, because its workload (token cost analysis and budget reports) does not require deep reasoning.
// From openclaw.json: agent definitions
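The primary-then-fallbacks routing described above can be sketched in Python. This is only an illustration of the idea, not OpenClaw's implementation: `complete_with_fallback`, `ModelUnavailable`, and `fake_call` are hypothetical names, and the model identifiers are copied verbatim from the roster table.

```python
from typing import Callable, Sequence, Tuple


class ModelUnavailable(Exception):
    """Raised by a (hypothetical) provider call when a model cannot serve a request."""


def complete_with_fallback(
    prompt: str,
    primary: str,
    fallbacks: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try the primary model first, then each fallback in order.

    Returns (model_used, reply). Raises RuntimeError if every model fails.
    """
    last_error = None
    for model in [primary, *fallbacks]:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            last_error = exc  # remember the failure and move down the chain
    raise RuntimeError("all models in the chain failed") from last_error


# Stand-in provider that simulates an outage of the primary model.
def fake_call(model: str, prompt: str) -> str:
    if model == "deepseek-v4-pro":
        raise ModelUnavailable(model)
    return f"[{model}] ok"


used, reply = complete_with_fallback(
    "ping", "deepseek-v4-pro", ["v4-flash", "gpt-4o-mini"], fake_call
)
```

With the primary simulated as down, the call is served by the first fallback in the list.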
4. Delegation Workflow
When a task arrives, Serena executes the following workflow:
1. RECEIVE task (from Feishu, WebChat, or Cron)
Critical rules enforced by Serena’s system prompt:
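“You hold supreme control, scheduling, review, and termination authority over all agents. All sub-agents must report their results to you. You must review the quality, security, and completeness of sub-agent output.”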
This creates a hierarchical review chain: sub-agents never deploy to production without Serena’s sign-off.
5. Agent-to-Agent (A2A) Communication
OpenClaw’s agentToAgent feature enables direct inter-agent messaging through a dedicated bus:
"tools": {
This allows, for example, the coder agent to request a security review from the security agent without going through the orchestrator for every sub-step. However, all final outputs must still pass Serena’s review gate.
Practical flow for a code change:
User → Serena → coder (branches & codes)
6. Memory System Integration
Each agent has access to the shared memory-core system with dreaming enabled:
"plugins": {
The memory-wiki plugin bridges dreaming reports, daily notes, and memory artifacts into a searchable wiki. This allows any agent to recall past decisions, configurations, and troubleshooting steps across sessions.
Session memory captures the last 60 messages per session with LLM slugging for efficient retrieval. All agents share this context, enabling continuity across delegated tasks.
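The 60-message window can be pictured as a bounded queue. The sketch below is a conceptual model only: `SessionMemory` is a hypothetical class, not the memory-core API, and it leaves out the LLM slugging step.

```python
from collections import deque
from typing import List, Tuple


class SessionMemory:
    """Keeps only the most recent messages of a session (60 by default)."""

    def __init__(self, max_messages: int = 60) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._messages: deque = deque(maxlen=max_messages)

    def append(self, role: str, text: str) -> None:
        self._messages.append((role, text))

    def window(self) -> List[Tuple[str, str]]:
        """Return the retained messages, oldest first."""
        return list(self._messages)


mem = SessionMemory()
for i in range(100):
    mem.append("user", f"message {i}")
```

After 100 appends, only messages 40 through 99 remain in the window.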
7. Cost Optimization — The Accountant Agent
A standout pattern is the accountant agent (江惜月 / Jiang Xiyue). Running on the cheaper deepseek-v4-flash model ($0.14 input / $0.28 output per million tokens, versus $1.74 / $3.48 for deepseek-v4-pro), it:
- Calculates per-task token costs across all agents
- Tracks monthly API spend by provider
- Generates cost reports saved to workspace/reports/
- Recommends model downgrades when appropriate
This agent runs on a nightly cron schedule (see Article #8) to produce daily cost summaries without user intervention.
# Cost comparison per agent (daily average)
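To make the per-task arithmetic concrete, here is a small Python sketch using the prices quoted above. Reading each price pair as (input, output) is an assumption, and the token counts are a made-up workload rather than measured data.

```python
# USD per million tokens as quoted in this post; treating each pair as
# (input, output) is an assumption.
PRICES = {
    "deepseek-v4-flash": (0.14, 0.28),
    "deepseek-v4-pro": (1.74, 3.48),
}


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one task on the given model."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical task: 200k input tokens and 50k output tokens.
pro = task_cost("deepseek-v4-pro", 200_000, 50_000)
flash = task_cost("deepseek-v4-flash", 200_000, 50_000)
savings = pro - flash
```

For this hypothetical task the pro model costs about $0.52 and the flash model about $0.04, which is the same roughly 12x gap as in the price list.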
8. Problems Encountered and Solutions
Problem 1: Agent Context Pollution
Issue: When an agent completed a task and stored its output, subsequent agents in the delegation chain would see the previous agent’s working context, causing hallucinations (e.g., tester assuming coder’s branch was already merged).
Solution: We enforced strict session isolation via OpenClaw’s session management. Each delegated subtask gets a fresh sub-session that only imports the task specification and relevant memory, not the full parent session history.
"session": {
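The isolation rule can be sketched as a constructor that never sees the parent history. This illustrates the principle only and is not OpenClaw's session API: `Session` and `spawn_subsession` are hypothetical names and the sample messages are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Session:
    messages: List[str] = field(default_factory=list)


def spawn_subsession(task_spec: str, relevant_memory: List[str]) -> Session:
    """Build a fresh sub-session from the task spec and selected memory only.

    The parent session is deliberately not a parameter, so its working
    context cannot leak into the delegated task.
    """
    seeded = [f"TASK: {task_spec}"]
    seeded.extend(f"MEMORY: {item}" for item in relevant_memory)
    return Session(messages=seeded)


# Invented parent context that the tester should not see.
parent = Session(messages=["coder: working on branch feature-x",
                           "coder: tests not run yet"])
sub = spawn_subsession("Run the test suite against the feature branch",
                       ["CI uses pytest"])
```

Because the parent history is not an input, the tester starts from the task specification and the selected memory items and nothing else.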
Problem 2: Elevated Tool Overuse
Issue: The elevated tool profile allowed agents to execute shell commands. The coder agent, when eager to test, would run systemctl restart or apt upgrade without review, risking production stability.
Solution: We restricted elevated access to Serena only:
"tools": {
Sub-agents (coder, devops, etc.) do not have elevated: true in their agent definitions. The devops agent is further constrained with explicit scope rules:
“Execute only explicitly assigned tasks; any inspection, modification, or audit beyond the task scope is forbidden.”
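The effect of that restriction can be modeled as a check in front of every tool call. The sketch below is a conceptual stand-in, not OpenClaw's enforcement code: `ELEVATED_AGENTS`, `PermissionDenied`, and `run_tool` are hypothetical names.

```python
# Only the orchestrator keeps elevated access, as described above.
ELEVATED_AGENTS = frozenset({"main"})


class PermissionDenied(Exception):
    """Raised when an agent requests an elevated tool it is not allowed to use."""


def run_tool(agent_id: str, command: str, elevated: bool = False) -> str:
    """Pretend to run a tool call after checking the elevated-access gate."""
    if elevated and agent_id not in ELEVATED_AGENTS:
        raise PermissionDenied(f"{agent_id} may not run elevated: {command}")
    return f"{agent_id} ran: {command}"


ok = run_tool("main", "systemctl restart gateway", elevated=True)

try:
    run_tool("coder", "apt upgrade", elevated=True)
    blocked = False
except PermissionDenied:
    blocked = True
```

Non-elevated calls still pass through for every agent, so the gate removes only the risky shell path. The service name `gateway` in the example command is illustrative.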
Problem 3: Sub-Agent Circular Delegation
Issue: The devops agent, which can delegate to security and tester, would occasionally enter a delegation loop: devops → security → devops → tester → security → devops. This wasted tokens and confused the agents.
Solution: We enforced strict subagent allow lists per agent:
- main → all agents
- coder → tester only
- devops → security, tester
- uxui → tester only
- All others → none
This creates a DAG (Directed Acyclic Graph) where no circular delegation is possible. Serena remains the only agent with full visibility.
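Whether a given allow list really is acyclic can be checked mechanically. The Python sketch below does this with the standard library's `graphlib`. The `ALLOW` mapping mirrors the roster table, while `is_acyclic` and the `looping` variant are hypothetical helpers added for illustration.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List

# Subagent allow lists, taken from the roster table in this post.
ALLOW: Dict[str, List[str]] = {
    "main": ["coder", "devops", "tester", "security", "uxui", "accountant"],
    "coder": ["tester"],
    "devops": ["security", "tester"],
    "uxui": ["tester"],
    "tester": [],
    "security": [],
    "accountant": [],
}


def is_acyclic(allow: Dict[str, List[str]]) -> bool:
    """Return True if no delegation path can lead back to its starting agent."""
    try:
        # static_order() raises CycleError as soon as it detects a cycle.
        tuple(TopologicalSorter(allow).static_order())
    except CycleError:
        return False
    return True


# A misconfiguration that lets security delegate back to devops.
looping = {**ALLOW, "security": ["devops"]}
```

A check like `is_acyclic(ALLOW)` could run whenever the configuration changes, so a loop such as the one in `looping` is rejected before it wastes tokens.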
Problem 4: N100 Resource Contention
Issue: Running 7 agents simultaneously on an Intel N100 (4 cores, 6W TDP) with local Ollama models caused CPU thrashing. The qwen2.5:1.5b fallback model, while cheap, competed for resources with the main gateway process.
Solution: We moved local Ollama models to a secondary fallback tier and configured agent concurrency limits:
"agents": {
All primary agent traffic routes through remote API providers (DeepSeek, OpenAI, Volcengine). Local Ollama is used only as a last-resort fallback when all remote APIs are unreachable.
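A concurrency cap of this kind can be sketched with an `asyncio.Semaphore`. This is an illustration under stated assumptions and not OpenClaw's scheduler: the limit of 2 is a hypothetical value (the real setting is not shown here), and `run_agent` simulates a remote API call with a short sleep.

```python
import asyncio
from typing import Dict, List, Tuple

MAX_CONCURRENT_AGENTS = 2  # hypothetical limit, chosen only for the example

AGENTS = ["main", "coder", "devops", "tester", "security", "uxui", "accountant"]


async def run_agent(agent_id: str, gate: asyncio.Semaphore,
                    stats: Dict[str, int]) -> str:
    """Simulate one agent task while holding a slot in the concurrency gate."""
    async with gate:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)  # stands in for a remote model call
        stats["running"] -= 1
    return agent_id


async def run_all(agent_ids: List[str],
                  limit: int = MAX_CONCURRENT_AGENTS) -> Tuple[List[str], int]:
    """Run every agent task, never more than `limit` at once."""
    gate = asyncio.Semaphore(limit)
    stats = {"running": 0, "peak": 0}
    results = await asyncio.gather(
        *(run_agent(a, gate, stats) for a in agent_ids)
    )
    return list(results), stats["peak"]


finished, peak = asyncio.run(run_all(AGENTS))
```

All seven tasks complete, but the recorded peak never exceeds the limit, which is the property that keeps a 4-core machine from thrashing.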
9. Memory and Reports Directory Structure
All agents persist their work to a shared workspace:
~/.openclaw/workspace/
This structure enables Serena to cross-reference reports across agents and generate unified daily summaries.
10. Summary
The OpenClaw multi-agent architecture on HanyanOS demonstrates a production-grade pattern for hierarchical AI agent orchestration:
- 7 specialized agents with distinct personae, tools, and model assignments
- Hierarchical delegation with Serena as the single review gate
- DAG-based subagent permissions preventing circular delegation
- Cost-aware routing (accountant on flash model, others on pro)
- Persistent memory and reporting for auditability and continuity
- N100-appropriate resource management favoring remote APIs over local compute
Total operational cost: approximately $8-12/day across all agents, with nightly accountant reports keeping spend visible and optimized.
Next in series: #7 — Layered Memory Architecture — Dreaming, Wiki Bridge, and Session Persistence