HanyanOS Deployment Journal

Posted on 2026-05-16, filed under tech

Introduction

HanyanOS is not an operating system in the traditional kernel sense — it is an Agent Operating System: a purpose-built runtime that orchestrates AI agents, self-hosted services, network tunneling, and automated governance on commodity hardware. The entire stack runs on a single Intel N100 mini-PC in Brisbane, Australia, fronted by an AWS Lightsail instance in Singapore for edge ingress.

This post is the first in a series documenting the full deployment architecture, design decisions, and operational lessons learned.

Hardware & Topology

    Internet
        │
┌───────┴────────┐
│  SG Lightsail   │
│  t3.nano        │
│  52.220.247.252 │
│  (Edge Ingress) │
└───────┬────────┘
        │ FRP / Xray
        │
┌───────┴────────┐
│  Brisbane N100  │
│  Intel N100     │
│  4C/4T, 11GB    │
│  Ubuntu 24.04   │
│  (Service Hub)  │
└────────────────┘

Node Specifications

Node	Role	Spec	Location
SG Lightsail	Edge Ingress	t3.nano (0.5 GB RAM, 20 GB disk)	AWS ap-southeast-1
Brisbane N100	Service Hub	Intel N100 (11 GB RAM, 1 TB SSD)	Home, Australia

The VPS is kept deliberately thin: it runs only Nginx (stream SNI routing and TLS termination), Xray (VPN), and the FRP server. All business logic, databases, AI models, and application services reside on the N100, which minimizes attack surface and operational complexity at the edge.

The Three-Layer Architecture

HanyanOS follows a strict three-layer isolation model:

┌──────────────────────────────────────────────────┐
│                  L3: Governance                   │
│  OpenClaw · Memory System · Rules Engine         │
│  Cron Scheduler · n8n Automation                 │
├──────────────────────────────────────────────────┤
│                  L2: Services                     │
│  WordPress · MySQL · Stalwart Mail · Ollama AI   │
│  SnappyMail · AI API Gateway                     │
├──────────────────────────────────────────────────┤
│                  L1: Infrastructure               │
│  Docker · FRP Tunnel · Nginx · Xray · UFW        │
│  acme.sh · Fail2ban · SSH                        │
└──────────────────────────────────────────────────┘

L1 — Infrastructure Layer

The foundation is built on Docker containers for service isolation, FRP for secure tunneling from the VPS to the N100 (which has no static public IP), and a multi-layered Nginx reverse proxy with SNI-based routing.

Key infrastructure components:

12 Docker containers running on the N100
FRP v0.61.0 tunnels replacing legacy SSH reverse tunnels
Xray REALITY with VLESS+Vision+XTLS for VPN access
UFW + Fail2ban for host-level security

L2 — Services Layer

Production services run in Docker containers (the AI stack runs under systemd instead) and bind to loopback interfaces on the N100 where possible, so they are reachable only through the FRP tunnel or from the local network:

Service	Stack	Port	Container
Blog	WordPress + Apache + PHP	8081	serena-wp
Database	MySQL 8.0	3307	hanyan-db
Mail	Stalwart Mail Server	25/465/587/993	stalwart-mail
Webmail	SnappyMail	8091	snappymail
AI	OpenClaw Gateway + Ollama	18789/11434	systemd

L3 — Governance Layer

This is what distinguishes HanyanOS from a typical homelab setup. The governance layer provides:

Multi-Agent Orchestration via OpenClaw — 7 specialized agents (main, coder, devops, tester, uxui, accountant, security)
Hierarchical Memory System — 6-layer memory with core identity data, user preferences, infrastructure state, project knowledge, session archives, and ephemeral cache
Cron-based Automation — 15+ scheduled tasks including nightly patrol, dream processing, backup rotation, and SSL renewal
n8n Workflows — 3 active automation pipelines for hot topic detection and webhook processing

The SNI Routing Design

One of the most critical pieces is the Nginx stream block with ssl_preread enabled on the Lightsail VPS. All seven hostnames share port 443 and are routed by SNI before any TLS termination happens:

# /etc/nginx/nginx.conf — stream block
stream {
    map $ssl_preread_server_name $backend {
        blog.chenyun.org     web_backend;
        www.chenyun.org      web_backend;
        chenyun.org          web_backend;
        ai.chenyun.org       web_backend;
        api.chenyun.org      web_backend;
        mail.chenyun.org     web_backend;
        vpn.chenyun.org      xray_backend;
        default              xray_backend;
    }

    upstream web_backend {
        server 127.0.0.1:8443;
    }

    upstream xray_backend {
        server 127.0.0.1:8444;
    }

    server {
        listen 443;
        ssl_preread on;
        proxy_pass $backend;
    }
}
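
The lookup performed by the map block can be modelled in a few lines of Python. This is only an illustrative sketch, not how Nginx works internally: the names `pick_backend`, `SNI_MAP`, and `UPSTREAMS` are hypothetical, while the hostnames and upstream addresses are copied from the config above.

```python
from typing import Optional

# Upstream addresses as declared in the stream block.
UPSTREAMS = {
    "web_backend": "127.0.0.1:8443",
    "xray_backend": "127.0.0.1:8444",
}

# Hostname -> upstream name, mirroring the nginx map entries.
SNI_MAP = {
    "blog.chenyun.org": "web_backend",
    "www.chenyun.org": "web_backend",
    "chenyun.org": "web_backend",
    "ai.chenyun.org": "web_backend",
    "api.chenyun.org": "web_backend",
    "mail.chenyun.org": "web_backend",
    "vpn.chenyun.org": "xray_backend",
}

# Same role as the `default` line in the map.
DEFAULT_BACKEND = "xray_backend"


def pick_backend(server_name: Optional[str]) -> str:
    """Return the upstream address for an SNI name (None means no SNI was sent)."""
    name = (server_name or "").lower()  # map string matching ignores case
    return UPSTREAMS[SNI_MAP.get(name, DEFAULT_BACKEND)]


if __name__ == "__main__":
    for host in ("blog.chenyun.org", "vpn.chenyun.org", "unknown.example"):
        print(host, "->", pick_backend(host))
```

Because the default points at the Xray backend, any hostname that is missing from the map never reaches the web tier, which is worth remembering when a new site appears to "do nothing".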

The secondary Nginx on port 8443 terminates SSL and handles reverse proxying:

# /etc/nginx/sites-enabled/blog-ssl
server {
    listen 127.0.0.1:8443 ssl http2;
    server_name blog.chenyun.org;

    ssl_certificate     /etc/ssl/chenyun/fullchain.cer;
    ssl_certificate_key /etc/ssl/chenyun/chenyun.org.key;

    location / {
        proxy_pass http://127.0.0.1:7444;  # FRP local port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

This design serves six web hostnames behind a single public IP. Adding a service touches only three places: a new entry in the stream map (unlisted names fall through to xray_backend), a new server block on port 8443, and a new FRP proxy.

FRP: Replacing SSH Tunnels

The original architecture used SSH reverse tunnels (ssh -R), which proved fragile. After migrating to FRP v0.61.0, the tunnel topology became:

SG Lightsail (FRPS)                 N100 (FRPC)
  :7443 ─── control ───→  (frpc connects outbound)
  :7444 ─── blog ───────→  :8081 (WordPress)
  :7445 ─── ai/api ─────→  :8090 (AI services)
  :7446 ─── smtp ───────→  :25   (Stalwart)
  :7447 ─── submission ─→  :465  (Stalwart)
  :7448 ─── imaps ──────→  :993  (Stalwart)
  :7449 ─── webmail ────→  :8091 (SnappyMail)

FRP configuration is minimal and reliable:

# /etc/frp/frps.toml (server)
bindPort = 7443
auth.token = "chenyun-frp-2026"  # shared secret; must match the client

# /etc/frp/frpc.toml (client)
serverAddr = "52.220.247.252"
serverPort = 7443
auth.token = "chenyun-frp-2026"  # shared secret; must match the server

[[proxies]]
name = "blog"
type = "tcp"
localIP = "127.0.0.1"
localPort = 8081
remotePort = 7444

Key improvement over SSH tunnels: FRP provides automatic reconnection, health checks, and a clean port management model. The previous SSH tunnel required manual process supervision and left dangling ports on 0.0.0.0.
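
Since every tunnel in the topology follows the same shape, the `[[proxies]]` stanzas could be generated rather than typed by hand. The sketch below is one possible way to do that, not the tooling actually used here: `PROXIES` and `render_proxies` are hypothetical names, the proxy name `ai-api` is an assumed spelling of the "ai/api" tunnel, the field names mirror the blog entry above, and the port pairs are copied from the topology listing.

```python
# (name, local port on the N100, remote port on the VPS)
PROXIES = [
    ("blog", 8081, 7444),
    ("ai-api", 8090, 7445),      # assumed name for the "ai/api" tunnel
    ("smtp", 25, 7446),
    ("submission", 465, 7447),
    ("imaps", 993, 7448),
    ("webmail", 8091, 7449),
]


def render_proxies(proxies):
    """Render frpc [[proxies]] stanzas, refusing duplicate remote ports."""
    remote_ports = [remote for _, _, remote in proxies]
    if len(remote_ports) != len(set(remote_ports)):
        raise ValueError("duplicate remotePort in proxy table")

    stanzas = []
    for name, local, remote in proxies:
        stanzas.append("\n".join([
            "[[proxies]]",
            f'name = "{name}"',
            'type = "tcp"',
            'localIP = "127.0.0.1"',
            f"localPort = {local}",
            f"remotePort = {remote}",
        ]))
    return "\n\n".join(stanzas) + "\n"


if __name__ == "__main__":
    print(render_proxies(PROXIES))
```

The duplicate check is the main point of generating the file: two proxies claiming the same remote port is an easy mistake to make when the table is edited by hand.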

The Agent OS: OpenClaw & Multi-Agent Coordination

HanyanOS runs on OpenClaw, an open-source AI agent runtime. The system orchestrates 7 specialized agents:

            ┌─────────────┐
            │  Serena 🧠  │  (Main — Orchestrator)
            └──────┬──────┘
     ┌─────────────┼─────────────┐
     │             │             │
┌────┴───┐   ┌────┴───┐   ┌────┴───┐
│ Devops │   │ Coder  │   │ Tester │
│ 苏清雅 │   │        │   │ 钟离燕 │
└────────┘   └────────┘   └────────┘
┌────────┐   ┌────────┐   ┌────────┐
│ UX/UI  │   │Account.│   │Security│
└────────┘   └────────┘   └────────┘

Each agent has:

An isolated workspace directory
A dedicated memory namespace
A personality definition (soul) crafted by Serena
The ability to communicate cross-agent via OpenClaw’s agentToAgent protocol

The coordination follows a strict workflow: Serena delegates → agent executes → agent reports → Serena reviews → final summary.
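
The delegate, execute, report, review, summarize loop can be sketched as plain Python. This is a toy model under stated assumptions and does not use OpenClaw's API: `Orchestrator`, `Report`, and the review rule (a report passes if it is non-empty) are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Report:
    agent: str
    task: str
    result: str
    approved: bool = False


@dataclass
class Orchestrator:
    """Stand-in for the main agent (Serena); not an OpenClaw class."""
    agents: Dict[str, Callable[[str], str]]
    log: List[str] = field(default_factory=list)

    def run(self, assignments: Dict[str, str]) -> str:
        reports = []
        for agent, task in assignments.items():
            self.log.append(f"delegate:{agent}")
            result = self.agents[agent](task)        # agent executes
            report = Report(agent, task, result)     # agent reports
            self.log.append(f"report:{agent}")
            report.approved = bool(result.strip())   # placeholder review rule
            self.log.append(f"review:{agent}")
            reports.append(report)
        # Final summary contains only the reports that passed review.
        return "; ".join(f"{r.agent}: {r.result}" for r in reports if r.approved)


if __name__ == "__main__":
    orc = Orchestrator(agents={
        "devops": lambda task: f"done {task}",
        "tester": lambda task: "",                   # empty report fails review
    })
    print(orc.run({"devops": "rotate certs", "tester": "smoke test"}))
```

Keeping the review step inside the orchestrator, rather than letting agents approve their own work, is what makes the workflow "strict" in this model.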

Memory System: 6-Layer Hierarchy

The memory architecture is file-based (Markdown as source of truth, JSON as index), with six distinct layers:

Layer	Location	Content	Volatility
Core	`memory/core/`	Identity, personality	Extremely low
User	`memory/user/`	User preferences, habits	Low
Infrastructure	`memory/infrastructure/`	Network topology, ports, SSL, DNS	Medium
Projects	`memory/projects/`	Active project states	Medium-High
Summaries	`memory/summaries/`	Conversation compression	Ephemeral
Cache	`memory/cache/`	Temporary data	Very high

This separation is critical: when the context window is compressed, non-essential layers are pruned first. Infrastructure state is always preserved because it’s referenced by virtually every task.
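
A minimal sketch of that pruning rule follows. Several things here are assumptions rather than the actual implementation: the function name `select_layers`, the idea of measuring layers by a size and a budget, the exact pruning order between cache and summaries, and treating the core layer as protected alongside infrastructure.

```python
# Most volatile first, following the volatility column of the table above.
PRUNE_ORDER = ["cache", "summaries", "projects", "user"]

# Infrastructure is always kept per the text; keeping core too is an assumption.
PROTECTED = {"core", "infrastructure"}


def select_layers(sizes: dict, budget: int) -> list:
    """Drop the most volatile layers until the remaining ones fit the budget.

    Protected layers are never dropped, even if they alone exceed the budget.
    """
    kept = dict(sizes)
    for layer in PRUNE_ORDER:
        if sum(kept.values()) <= budget:
            break
        kept.pop(layer, None)
    return sorted(kept)


if __name__ == "__main__":
    sizes = {"core": 10, "user": 20, "infrastructure": 30,
             "projects": 40, "summaries": 50, "cache": 60}
    print(select_layers(sizes, budget=100))
```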

Security Architecture

Security follows a defense-in-depth approach across three layers:

Layer 1 — VPS Edge (SG Lightsail)
  ├── UFW: default deny, 9 ports whitelisted
  ├── Fail2ban: SSH + Nginx rate limiting
  ├── No business logic on VPS
  └── FRP ports restricted to N100 backends only

Layer 2 — Tunnel (FRP)
  ├── Token-authenticated control channel
  ├── Service-specific port mapping
  └── All tunnels originate from N100 outbound

Layer 3 — N100 Service Hub
  ├── Services bound to 127.0.0.1 where possible
  ├── Docker network isolation
  ├── Local UFW with service-specific rules
  └── Ongoing: automated security patching

Problem-Solving Case Study: The Open Port Incident

During initial setup, the VPS had UFW inactive with 15+ exposed ports including SSH reverse tunnels bound to 0.0.0.0. A security audit revealed:

Discovered risks:

Ports 2222, 8090, 10025, 10143, 10465, 10587, 10993 all public
FRP management ports 7443-7451 fully exposed
Netdata on 19999 broadcasting system metrics publicly
No rate limiting on API endpoints

Resolution steps:

Enabled UFW with default deny incoming
Whitelisted only 9 essential ports (22, 80, 443, 25, 465, 587, 993, 8443, 7443)
Closed all legacy SSH tunnel ports
Bound internal services to 127.0.0.1 on the N100
Documented the entire security posture in infrastructure memory

The lesson: a VPS with 0.5GB RAM should not be a general-purpose server. Its role is strictly edge routing.
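
An audit like the one above could be approximated by parsing the output of `ss -tlnH` and flagging any listener bound to a wildcard address on a port outside the whitelist. The sketch below is an assumption about how such a check might look, not the audit that was actually run; `unexpected_public_ports` and `SAMPLE` are hypothetical, and the allowed set is the nine ports listed in the resolution steps.

```python
# The nine whitelisted ports from the resolution steps.
ALLOWED = {22, 80, 443, 25, 465, 587, 993, 8443, 7443}

# Local addresses that mean "listening on every interface".
WILDCARDS = {"0.0.0.0", "*", "[::]"}


def unexpected_public_ports(ss_output: str, allowed=ALLOWED) -> list:
    """Return sorted ports that listen on a wildcard address but are not allowed."""
    flagged = set()
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        addr, _, port = fields[3].rpartition(":")
        if addr in WILDCARDS and port.isdigit() and int(port) not in allowed:
            flagged.add(int(port))
    return sorted(flagged)


# Hand-written sample in the shape of `ss -tlnH` output (not real data).
SAMPLE = """\
LISTEN 0 511  0.0.0.0:443    0.0.0.0:*
LISTEN 0 4096 127.0.0.1:8443 0.0.0.0:*
LISTEN 0 4096 0.0.0.0:19999  0.0.0.0:*
LISTEN 0 128  [::]:2222      [::]:*
"""

if __name__ == "__main__":
    print(unexpected_public_ports(SAMPLE))
```

On a live host the input would come from something like `subprocess.run(["ss", "-tlnH"], capture_output=True, text=True).stdout`, and a non-empty result is a reasonable trigger for an alert in the morning patrol.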

Operational Automation

The system runs 15+ cron jobs for self-maintenance:

03:00 — Dream pipeline (AI reflection processing)
03:30 — Memory optimization (compaction, indexing)
04:00 — Daily backup rotation (7-day retention)
07:00 — Wake cycle (daytime resource scaling)
08:00 — Morning patrol (health checks, anomaly detection)
20:56 — SSL certificate renewal check (acme.sh)
23:00 — Sleep cycle (nighttime power saving)
Every 2h — Emotion engine reflection
Every 4h — Knowledge consolidation
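
For readers who want to reproduce the schedule, the times above translate mechanically into five-field cron expressions. The helper below is a sketch: `to_cron` is a hypothetical name, and running the "Every Nh" jobs at minute 0 is an assumption, since the list does not state the minute.

```python
import re


def to_cron(when: str) -> str:
    """Convert 'HH:MM' (daily) or 'Every Nh' into a five-field cron expression."""
    text = when.strip()

    daily = re.fullmatch(r"(\d{1,2}):(\d{2})", text)
    if daily:
        hour, minute = int(daily.group(1)), int(daily.group(2))
        if hour > 23 or minute > 59:
            raise ValueError(f"invalid time: {when!r}")
        return f"{minute} {hour} * * *"

    every = re.fullmatch(r"[Ee]very (\d+)h", text)
    if every:
        step = int(every.group(1))
        if not 1 <= step <= 23:
            raise ValueError(f"invalid interval: {when!r}")
        return f"0 */{step} * * *"  # assumption: fire on the hour

    raise ValueError(f"unrecognised schedule: {when!r}")


if __name__ == "__main__":
    for when in ("03:00", "20:56", "Every 2h"):
        print(when, "->", to_cron(when))
```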

Lessons Learned

FRP over SSH every time. Plain ssh -R tunnels have no built-in reconnection and, in this setup, left ports listening on 0.0.0.0. FRP is free, open source, and has held up in production here.
Document as you deploy. Every configuration change must be reflected in the infrastructure memory. Without this, the AI agents lose situational awareness.
Single public IP, many domains. Nginx ssl_preread + SNI routing on port 443 is the cleanest way to multiplex services behind one IP.
Keep the edge minimal. The less software runs on the VPS, the smaller the blast radius.
File-based memory over databases. For an AI system like this one, Markdown files are easier to read, diff, and repair by hand than SQL tables, and an LLM can consume them directly.

What’s Next

This series will continue with deep dives into each component:

#2: Nginx Reverse Proxy — SNI Routing for 7 Domains
#3: FRP Tunnels — Secure NAT Traversal from VPS to N100
#4: Stalwart Mail Server — Self-Hosted Email
#5: WordPress + MySQL — Dockerized Blog Deployment

The full source of truth for the deployment lives in HanyanOS/memory/infrastructure/, and the Agent OS runtime is open source at github.com/openclaw.

This is the first entry in the HanyanOS Deployment Journal series, documenting a production-grade Agent OS running on commodity hardware.