
2. System Architecture

Rysh's architecture rests on three pillars. Understanding them unlocks the entire codebase, because the same three ideas recur in the CLI, the server, and the shared core.

  1. The actor model — concurrency via supervised, message-passing actors (protoactor-go).
  2. NATS as the universal message bus — typed Go messages, JSON-encoded in an envelope, routed by subject.
  3. The agentic loop — a per-pane manager actor that spawns an orchestrator running the LLM tool-use loop.

2.1 Pillar 1 — The actor model

Both the CLI daemon and the server use github.com/asynkron/protoactor-go. An actor is a struct that implements Receive(ctx actor.Context), usually as a type switch over ctx.Message(). The runtime delivers messages from each actor's mailbox one at a time, so an actor's fields need no locks.
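The one-message-at-a-time guarantee can be illustrated without the library. The following is a minimal sketch that uses only goroutines and channels, not protoactor-go itself; the message types (increment, query) and the spawn helper are hypothetical names chosen for the illustration.

```go
package main

import "fmt"

// Hypothetical message types for the illustration.
type increment struct{ by int }
type query struct{ reply chan int }

// counterActor holds plain, unlocked state; only its own goroutine touches it.
type counterActor struct{ total int }

// Receive mirrors the type-switch shape described above.
func (c *counterActor) Receive(msg any) {
	switch m := msg.(type) {
	case increment:
		c.total += m.by
	case query:
		m.reply <- c.total
	}
}

// spawn starts a goroutine that drains the mailbox one message at a time
// and returns the send side of that mailbox.
func spawn(a interface{ Receive(any) }, size int) chan<- any {
	mailbox := make(chan any, size)
	go func() {
		for msg := range mailbox {
			a.Receive(msg)
		}
	}()
	return mailbox
}

// runCounter sends n increments from concurrent senders, then queries the total.
func runCounter(n int) int {
	mailbox := spawn(&counterActor{}, 16)
	done := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() {
			mailbox <- increment{by: 1}
			done <- struct{}{}
		}()
	}
	for i := 0; i < n; i++ {
		<-done // every increment is enqueued once all senders report in
	}
	reply := make(chan int)
	mailbox <- query{reply: reply} // FIFO mailbox: handled after all increments
	return <-reply
}

func main() {
	fmt.Println(runCounter(100)) // 100
}
```

Many goroutines send concurrently, yet total is mutated by a single goroutine, which is why no mutex appears in the actor.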

Two message transports coexist:

  • In-process protoactor sends (ctx.Send, ctx.Spawn, ctx.Stop) — used for tightly-coupled parent↔child coordination and control messages that never leave the process (e.g. the WorkspaceFarm↔Workspace messages, which are not in the codec).
  • NATS — the primary data plane, used for everything that may cross a process or machine boundary (TUI↔daemon, sharing, browser control). A NATSBridge (see §2.2) subscribes to subjects on an actor's behalf and pushes decoded messages into that actor's mailbox, so the actor's Receive handles NATS traffic and in-process sends the same way.

The CLI actor tree

graph TD
    Farm["WorkspaceFarmActor<br/>session root"]
    Farm --> WS["WorkspaceActor<br/>1 per workspace; 1 active"]
    WS --> Tab["TabActor<br/>per tab; owns pipeline"]
    Tab --> Lane["LaneActor<br/>column · flex weight"]
    Lane --> PG["PaneGroupActor<br/>row slot · rowFlex"]
    PG --> Pane["PaneActor<br/>PTY + VTerm"]
    Pane --> LLM["LLMPromptExecutionActor<br/>per-pane agent mgr"]
    Pane --> Mem["MemoryManagerActor"]
    LLM --> Orch["OrchestratorActor<br/>per prompt"]
    WS -.spawns on primary.-> AR["AgentRegistryActor"]
    WS -.-> HR["HumanoidRegistryActor"]
    WS -.-> SR["ShareRegistryActor"]
    AR --> Agent["AgentActor"]
    HR --> Human["HumanoidActor"]
    SR --> Up["UpstreamShareActor"]

Entity semantics (this terminology is used everywhere):

| Actor | Represents | Key field |
| --- | --- | --- |
| WorkspaceFarmActor | The session root; owns N workspaces, exactly one "active" | |
| WorkspaceActor | A workspace = a config-defined set of tabs (own upstream/API key) | owns ws.inbox/ws.snapshot |
| TabActor | A tab = a screen of lanes | pipeline state |
| LaneActor | A column | flex (horizontal width weight) |
| PaneGroupActor | A row slot in a lane, holding a stack of panes | rowFlex (vertical weight) |
| PaneActor | One PTY shell + optional LLM | owns *vterm.VTerm |

Splitting "down" makes a new pane group; "stacked" panes share a group and rotate. A grid seeds R×C lanes/groups in one shot.

Supervision & persistence

Actors restore from JetStream KV in their *actor.Started handler, so a rysh attach after the daemon was restarted rebuilds the whole tree. Writes are debounced (mark-dirty + flush on Stopping). Snapshots cascade down the tree via NATS request/reply, fanning out concurrently at the lane/group level; the workspace memoizes a full snapshot and a layout-only snapshot, invalidated on persist.
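The mark-dirty and flush pattern described above can be sketched as follows. This is an illustration under stated assumptions: kvStore merely counts writes in place of a JetStream KV bucket, and paneState, SetTitle, and Flush are hypothetical names, not the codebase's API.

```go
package main

import "fmt"

// kvStore stands in for a JetStream KV bucket; it only records and counts writes.
type kvStore struct {
	writes int
	data   map[string]string
}

func (s *kvStore) Put(key, value string) {
	if s.data == nil {
		s.data = make(map[string]string)
	}
	s.data[key] = value
	s.writes++
}

// paneState is a hypothetical piece of actor-owned state with a dirty flag.
type paneState struct {
	key   string
	title string
	dirty bool
}

// SetTitle mutates state and marks it dirty instead of writing immediately.
func (p *paneState) SetTitle(t string) {
	p.title = t
	p.dirty = true
}

// Flush writes at most once per dirty period. An actor would call it from
// its Stopping handler (and, as an assumption here, from a debounce timer).
func (p *paneState) Flush(s *kvStore) {
	if !p.dirty {
		return
	}
	s.Put(p.key, p.title)
	p.dirty = false
}

func main() {
	store := &kvStore{}
	pane := &paneState{key: "pane.1"}
	pane.SetTitle("a")
	pane.SetTitle("b")
	pane.SetTitle("c")
	pane.Flush(store) // one write covers three mutations
	pane.Flush(store) // no-op: nothing changed since the last flush
	fmt.Println(store.writes, store.data["pane.1"]) // 1 c
}
```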


2.2 Pillar 2 — NATS as the universal message bus

Process / transport model

graph LR
    subgraph client["rysh TUI process"]
        Model["Bubble Tea Model<br/>(pure NATS client —<br/>does NOT import actors)"]
    end
    subgraph daemon["rysh daemon process"]
        Bus["bus.Bus<br/>embedded nats-server<br/>+ JetStream KV<br/>+ protoactor System<br/>+ CodecRegistry<br/>+ NATSPublisher"]
        Tree["Actor tree"]
        Bridge["NATSBridge<br/>(per actor)"]
        Bus <--> Bridge <--> Tree
    end
    Model <-->|"NATS over TCP<br/>127.0.0.1:<port>"| Bus
    CLI["cli.Client<br/>(short-lived: rysh tab/pane/--cmd)"] -->|Send / Request| Bus

The daemon's bus.Bus is the spine: an embedded nats-server (default client port 24242, JetStream KV buckets rysh-{panes,workspace,pipeline,agents}-{session}), the protoactor ActorSystem (root actor workspace-farm), a CodecRegistry, and a NATSPublisher. TUI clients and the short-lived cli.Client connect to its client port. The optional web UI listens on 23232.

The wire format

Every message on NATS is a typed Go struct serialized into a JSON envelope:

// NATSEnvelope
{
  "t": "MsgConversationAppend",   // type tag (discriminator)
  "r": "_INBOX.abc...",           // reply-to subject ("" = fire-and-forget)
  "p": { /* the inner message, embedded JSON */ }
}

The payload p is a json.RawMessage, so the inner JSON is embedded verbatim (avoiding the base64 inflation a []byte would cause). The CodecRegistry maps type tags ↔ Go types: TagOf(msg) for encode, Decode(tag, payload) for decode. The NATSBridge unmarshals the envelope, decodes via the registry, and either delivers the inner message directly to the actor mailbox or — if a reply is expected — wraps it in a RequestEnvelope whose .Reply(resp) serializes and publishes the response.
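A minimal sketch of the envelope and registry round trip, using only the standard library. The names NATSEnvelope, CodecRegistry, TagOf, and Decode come from the text above; the Register, Encode, and DecodeEnvelope helpers, the use of the struct name as the tag, and the fields of MsgConversationAppend are assumptions made for the illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// NATSEnvelope uses the short keys shown above.
type NATSEnvelope struct {
	T string          `json:"t"` // type tag
	R string          `json:"r"` // reply-to subject, "" = fire-and-forget
	P json.RawMessage `json:"p"` // inner message, embedded verbatim
}

// MsgConversationAppend's fields are hypothetical; only the type name is given above.
type MsgConversationAppend struct {
	Role string `json:"role"`
	Text string `json:"text"`
}

// CodecRegistry maps type tags to Go types.
type CodecRegistry struct{ types map[string]reflect.Type }

func NewCodecRegistry() *CodecRegistry {
	return &CodecRegistry{types: make(map[string]reflect.Type)}
}

// TagOf derives the tag from the struct name (an assumption). Pass values, not pointers.
func (c *CodecRegistry) TagOf(msg any) string { return reflect.TypeOf(msg).Name() }

// Register records a message type under its tag.
func (c *CodecRegistry) Register(msg any) { c.types[c.TagOf(msg)] = reflect.TypeOf(msg) }

// Decode builds a value of the registered type from the raw payload.
func (c *CodecRegistry) Decode(tag string, payload json.RawMessage) (any, error) {
	t, ok := c.types[tag]
	if !ok {
		return nil, fmt.Errorf("unknown type tag %q", tag)
	}
	v := reflect.New(t)
	if err := json.Unmarshal(payload, v.Interface()); err != nil {
		return nil, err
	}
	return v.Elem().Interface(), nil
}

// Encode wraps msg in an envelope.
func (c *CodecRegistry) Encode(msg any, replyTo string) ([]byte, error) {
	p, err := json.Marshal(msg)
	if err != nil {
		return nil, err
	}
	return json.Marshal(NATSEnvelope{T: c.TagOf(msg), R: replyTo, P: p})
}

// DecodeEnvelope unmarshals the envelope, then decodes the payload via the registry.
func (c *CodecRegistry) DecodeEnvelope(wire []byte) (msg any, replyTo string, err error) {
	var env NATSEnvelope
	if err = json.Unmarshal(wire, &env); err != nil {
		return nil, "", err
	}
	msg, err = c.Decode(env.T, env.P)
	return msg, env.R, err
}

func main() {
	reg := NewCodecRegistry()
	reg.Register(MsgConversationAppend{})
	wire, err := reg.Encode(MsgConversationAppend{Role: "user", Text: "hi"}, "")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(wire))
	msg, replyTo, err := reg.DecodeEnvelope(wire)
	fmt.Printf("%+v reply=%q err=%v\n", msg, replyTo, err)
}
```

Because P is a json.RawMessage, the inner object appears as nested JSON in the wire bytes rather than as a base64 string.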

Note: the rysh-proto schema for NATSEnvelope declares type_tag/reply_to/bytes payload (protobuf), but the runtime is JSON with short keys {t, r, p}. The proto module is a reference schema only (§9).

Subject naming scheme

Subjects are namespaced by a session prefix (default rysh, settable via SetSessionPrefix). T(parts...) joins them as {session}.{parts...} (literally sessionPrefix + "." + strings.Join(parts, ".")):

{session}.ws.inbox                                   workspace command inbox
{session}.ws.snapshot                                layout/snapshot request-reply
{session}.tab.{id}.inbox                             tab commands
{session}.pane.{id}.inbox                            pane commands
{session}.pane.{id}.output                           merged output (shell+ai dual-publish)
{session}.pane.{id}.output.{mode}                    per-mode output (shell/ai/rysh/chat/email/slack/chatbot)
{session}.pane.{id}.history[.{mode}]                 conversation history
{session}.pane.{id}.llm_prompt_execution.{inbox|output|status}    the per-pane agent
{session}.pane.{id}.approval.{request|response}      tool-approval handshake
{session}.pane.{id}.browser.{request|response}       browser-control handshake
{session}.pane.{id}.memory.{mode|summarize}          conversation memory
{session}.pane.{id}.relay.data                       native-speed PTY relay
{session}.agent.{name}.inbox                          headless agent
{session}.humanoid.{name}.inbox                       humanoid agent
{session}.pane-group.{id}.inbox                       ephemeral approval-pane creation
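The subject builder follows directly from the description of T above. A minimal sketch: the package-level sessionPrefix variable is an assumption about how the prefix is stored, and the pane id "7" is an arbitrary example.

```go
package main

import (
	"fmt"
	"strings"
)

// sessionPrefix defaults to "rysh", as stated above.
var sessionPrefix = "rysh"

// SetSessionPrefix changes the namespace for all subsequently built subjects.
func SetSessionPrefix(p string) { sessionPrefix = p }

// T joins the parts under the session prefix: {session}.{parts...}.
func T(parts ...string) string {
	return sessionPrefix + "." + strings.Join(parts, ".")
}

func main() {
	fmt.Println(T("ws", "inbox"))               // rysh.ws.inbox
	fmt.Println(T("pane", "7", "output", "ai")) // rysh.pane.7.output.ai
	SetSessionPrefix("demo")
	fmt.Println(T("pane", "7", "approval", "request")) // demo.pane.7.approval.request
}
```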

The server uses a separate, workspace-scoped scheme for cross-machine sharing: ws.{workspaceID}.share.{shareID}.{output|command|command.ack|...}. This subtree is what the subject-ACL confines clients to (§7.4).

Dual-publish. shell and ai modes publish to both the per-mode topic and the merged output topic; other modes publish only per-mode. Modes ai, email, slack, chatbot have memory.
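The dual-publish rule can be written as a small pure function. This is a sketch only: outputSubjects and hasMemory are hypothetical helper names, and the mode lists are copied from the paragraph above.

```go
package main

import "fmt"

// outputSubjects returns every subject one output chunk is published to.
// shell and ai publish to the per-mode subject and the merged subject;
// all other modes publish to the per-mode subject only.
func outputSubjects(session, paneID, mode string) []string {
	perMode := fmt.Sprintf("%s.pane.%s.output.%s", session, paneID, mode)
	switch mode {
	case "shell", "ai":
		merged := fmt.Sprintf("%s.pane.%s.output", session, paneID)
		return []string{perMode, merged}
	default:
		return []string{perMode}
	}
}

// hasMemory reports whether a mode keeps conversation memory.
func hasMemory(mode string) bool {
	switch mode {
	case "ai", "email", "slack", "chatbot":
		return true
	}
	return false
}

func main() {
	fmt.Println(outputSubjects("rysh", "3", "ai"))   // [rysh.pane.3.output.ai rysh.pane.3.output]
	fmt.Println(outputSubjects("rysh", "3", "chat")) // [rysh.pane.3.output.chat]
	fmt.Println(hasMemory("slack"), hasMemory("shell")) // true false
}
```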


2.3 Pillar 3 — The agentic loop

Every pane that can run an agent owns an LLMPromptExecutionActor (the per-pane manager). It holds the conversation, system prompt, auto-approval set, and optional memory state, and persists the conversation to JetStream KV. On each user prompt it spawns a fresh OrchestratorActor that runs the autonomous tool-use loop to completion.

sequenceDiagram
    participant U as User / channel
    participant M as LLMPromptExecutionActor
    participant O as OrchestratorActor
    participant P as Claude provider
    participant T as Tool
    participant A as Approver (TUI / pane)

    U->>M: MsgAgenticPrompt
    M->>M: append user turn, trim to 50, build sys prompt + memory
    M->>O: spawn (conversation, tools, ctx)
    loop until end_turn / max iterations / cancel
        O->>P: CompleteWithTools(conv, toolSpecs, sysPrompt)
        P-->>O: AgenticResponse(text, toolCalls, stopReason)
        O-->>U: emit text (MsgAgenticOutput / ConversationMessage)
        alt tool requires approval
            O->>A: MsgApprovalRequest (diff or destructive)
            A-->>O: MsgApprovalResponse (yes / yes_always / no)
        end
        O->>T: Execute(params)
        T-->>O: ToolOutput
        O->>O: append assistant + tool-result turns
    end
    O->>M: MsgOrchestratorDone(full conversation)
    M->>M: merge conversation, persist to KV

Key behaviors (full detail in §4):

  • Conversation trimming — kept to the last 50 turns; KV persistence throttled to 2s; maxIterations default 20 (CLI raises to 50).
  • Loop detection — the orchestrator hashes toolName:params and blocks (returns an error to the LLM) if the same call repeats ≥3 times in the last 20.
  • Approval gating — file_edit/file_write execute first to compute a diff, then ask (preview-then-confirm); other destructive tools ask before executing. waitForApproval times out at 5 minutes (defaulting to No). yes_always records an auto-approval key (by file path or bash command word).
  • Context compaction — each iteration the orchestrator checks reported input-token usage; past 75% of the context limit (default 160 000 tokens) it summarizes and drops the oldest turns (keeping the most recent ~12, never splitting a tool_use/tool_result pair) and emits a context-% status line.
  • Provider abstraction — CompleteWithTools/CompleteWithToolsStream map to the Anthropic Messages API (the orchestrator prefers streaming when available, else non-streaming). Tool results are sent back as user messages with tool_result blocks; prompt caching is on by default, the response token Usage feeds compaction, and extended thinking is supported. Transient errors are retried inside the provider (5 attempts, exponential backoff, optional model fallback).
  • Sub-agents, permissions, stale-edit guards — the sub_agent tool spawns a depth-limited (MaxSubAgentDepth = 2) child orchestrator; a declarative permission policy can pre-allow/deny tool calls before the approval prompt; a read-tracker rejects edits to files changed on disk since they were read; all tool output is head/tail-truncated (shapeToolOutput).
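The loop-detection rule from the list above can be sketched as a sliding window over hashed calls. The details are assumptions: SHA-256 as the hash, the loopDetector type and its Check method, and counting the current call toward the threshold (so the third identical call inside the window is the one that gets blocked).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const (
	window    = 20 // number of recent calls remembered
	maxRepeat = 3  // block once a call appears this many times in the window
)

// loopDetector keeps the hashes of the most recent tool calls.
type loopDetector struct{ recent []string }

// callKey hashes "toolName:params" into a fixed-size key.
func callKey(tool, params string) string {
	sum := sha256.Sum256([]byte(tool + ":" + params))
	return hex.EncodeToString(sum[:])
}

// Check records the call and returns an error if it should be blocked.
// The error text is what would be handed back to the LLM as the tool result.
func (d *loopDetector) Check(tool, params string) error {
	key := callKey(tool, params)
	d.recent = append(d.recent, key)
	if len(d.recent) > window {
		d.recent = d.recent[len(d.recent)-window:]
	}
	count := 0
	for _, k := range d.recent {
		if k == key {
			count++
		}
	}
	if count >= maxRepeat {
		return fmt.Errorf("loop detected: %s repeated %d times in the last %d calls", tool, count, window)
	}
	return nil
}

func main() {
	d := &loopDetector{}
	for i := 1; i <= 3; i++ {
		fmt.Println(i, d.Check("bash", `{"cmd":"ls"}`))
	}
}
```

Returning an error rather than aborting lets the model see why the call was refused and choose a different action.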

The same engine runs locally (CLI panes/agents/humanoids) and in the cloud (server browser panes and chatbots) — only the tool registry differs. See 4. Agentic Engine for the full treatment.
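The approval gating listed under key behaviors (a timeout that resolves to No, and yes_always recording an auto-approval key) can be sketched with a channel and a timer. The Decision type, the autoApprovals set, the resolve helper, and the "bash:ls" key format are hypothetical; only waitForApproval, the three answers, and the 5-minute timeout come from the text.

```go
package main

import (
	"fmt"
	"time"
)

type Decision string

const (
	Yes       Decision = "yes"
	YesAlways Decision = "yes_always"
	No        Decision = "no"
)

const (
	approvalTimeout = 5 * time.Minute       // timeout stated in the text
	demoTimeout     = 10 * time.Millisecond // short timeout for this demo only
)

// waitForApproval blocks until a response arrives or the timeout elapses.
// A timeout is treated as No.
func waitForApproval(responses <-chan Decision, timeout time.Duration) Decision {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case d := <-responses:
		return d
	case <-timer.C:
		return No
	}
}

// autoApprovals remembers keys that were approved with yes_always.
type autoApprovals map[string]bool

// resolve consults the auto-approval set first and only then asks.
func (a autoApprovals) resolve(key string, responses <-chan Decision, timeout time.Duration) bool {
	if a[key] {
		return true
	}
	switch waitForApproval(responses, timeout) {
	case YesAlways:
		a[key] = true
		return true
	case Yes:
		return true
	default:
		return false
	}
}

func main() {
	approvals := autoApprovals{}
	ch := make(chan Decision, 1)

	ch <- YesAlways
	fmt.Println(approvals.resolve("bash:ls", ch, demoTimeout)) // true
	fmt.Println(approvals.resolve("bash:ls", ch, demoTimeout)) // true (auto-approved, no prompt)
	fmt.Println(approvals.resolve("bash:rm", ch, demoTimeout)) // false (timed out)
	_ = approvalTimeout
}
```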


2.4 How the three pillars compose: an input from keystroke to agent

graph TD
    Key["TUI: user presses Enter"] -->|MsgSubmitInput → ws.inbox| WS["WorkspaceActor.handleSubmitInput"]
    WS -->|"starts with ##"| Rysh["runRyshCommand<br/>(system command)"]
    WS -->|"starts with @ / @@"| Agent["route to agent/humanoid"]
    WS -->|"starts with ####"| Relay["relay to share source pane"]
    WS -->|"normal"| Pane["MsgPaneSubmitInput → pane.{active}.inbox"]
    Pane --> Disp{Pane mode?}
    Disp -->|shell| Shell["executeShell → PTY"]
    Disp -->|ai| Prompt["executePrompt → LLMPromptExecutionActor"]
    Disp -->|chat| Chat["executeChat"]
    Disp -->|rysh| Sys["executeRysh"]
    Prompt --> Loop["agentic loop (Pillar 3)"]
    Loop -->|"output.ai / output topics"| Stream["streamed to TUI content plane"]

Structural operations (create/close/focus/resize a pane) instead cascade down the actor tree: Workspace → Tab → Lane → PaneGroup → spawn PaneActor. Read operations (snapshots) cascade down via NATS request/reply and fan out concurrently.
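The prefix dispatch in the diagram above can be sketched as two small functions. The Route type, routeInput, and paneHandler are hypothetical names. Checking "####" before "##" is an assumption: since every "####" input also starts with "##", the longer prefix has to be tested first for the relay branch to be reachable.

```go
package main

import (
	"fmt"
	"strings"
)

type Route string

const (
	RouteRelay  Route = "relay"  // "####": relay to share source pane
	RouteSystem Route = "system" // "##": runRyshCommand
	RouteAgent  Route = "agent"  // "@" or "@@": agent/humanoid
	RoutePane   Route = "pane"   // anything else: active pane inbox
)

// routeInput classifies submitted input by its prefix, longest prefix first.
func routeInput(input string) Route {
	switch {
	case strings.HasPrefix(input, "####"):
		return RouteRelay
	case strings.HasPrefix(input, "##"):
		return RouteSystem
	case strings.HasPrefix(input, "@"): // covers both "@" and "@@"
		return RouteAgent
	default:
		return RoutePane
	}
}

// paneHandler maps a pane mode to the handler named in the diagram.
// It returns "" for modes the diagram does not show.
func paneHandler(mode string) string {
	switch mode {
	case "shell":
		return "executeShell"
	case "ai":
		return "executePrompt"
	case "chat":
		return "executeChat"
	case "rysh":
		return "executeRysh"
	}
	return ""
}

func main() {
	for _, in := range []string{"####ls", "##help", "@@alice hi", "ls -la"} {
		fmt.Printf("%q -> %s\n", in, routeInput(in))
	}
	fmt.Println(paneHandler("ai")) // executePrompt
}
```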


2.5 Rendering architecture (CLI TUI)

The TUI is deliberately decoupled: snapshots carry layout only, while per-pane content is streamed and reconciled. This keeps the Bubble Tea View() cheap and the daemon authoritative.

graph LR
    PTY["PaneActor PTY"] --> VT["VTerm (vt10x fork<br/>+ scrollback)"]
    VT -->|"output.{mode} deltas"| Stream
    VT -->|"raw VT screen (50ms)"| RawPull
    subgraph TUI
        Stream["content stream<br/>(wildcard subs)"] --> Buf["per-pane content buffers"]
        RawPull["raw VT fast-pull<br/>(interactive panes)"] --> Buf
        Backfill["full backfill on<br/>first visibility"] --> Buf
        Reconcile["~1.5s reconcile<br/>(heal dropped deltas)"] --> Buf
        Buf --> Rehydrate["rehydrateSnapshot()"]
        Rehydrate --> View["lipgloss View()"]
    end

For fullscreen interactive apps (vim/htop/Claude Code), the TUI escapes the snapshot/render path entirely via a PTYRelay: Bubble Tea releases the terminal and proxies stdin↔PTY over pane.{id}.relay.data at native speed.

See 6. rysh-cli for details.


2.6 Trust boundaries

graph TB
    subgraph trusted["Trusted (local daemon)"]
        CLIagent["CLI panes/agents/humanoids<br/>(full toolbox: bash, file, git...)"]
    end
    subgraph semi["Semi-trusted (authenticated clients)"]
        ChromePane["Chrome extension pane<br/>(PaneToolsBrowser)"]
    end
    subgraph untrusted["Untrusted (public web)"]
        ChatbotPane["Chatbot pane<br/>(PaneToolsNone — no tools, text only)"]
    end
    CLIagent --- note1["local FS/shell access"]
    ChromePane --- note2["browser control only,<br/>execute_js needs approval"]
    ChatbotPane --- note3["cannot touch host;<br/>closed widget protocol,<br/>never sees raw subjects"]

The server deliberately gives zero tools to chatbot panes (they run on third-party sites), a curated browser toolset to extension panes, and the full toolbox only to trusted local CLI agents. Multi-tenant isolation is enforced by the subject-ACL (§7.4).
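One way to picture the three tiers is a tool registry keyed by pane kind. The following is a sketch under assumptions: PaneToolsNone and PaneToolsBrowser are named above, while PaneToolsFull, the toolsFor and allowed helpers, and the browser tool names other than execute_js are placeholders invented for the illustration.

```go
package main

import "fmt"

type PaneTools int

const (
	PaneToolsNone    PaneTools = iota // chatbot panes: text only
	PaneToolsBrowser                  // extension panes: browser control
	PaneToolsFull                     // hypothetical name for the local CLI toolbox
)

// toolsFor returns the tool names exposed to a pane of the given kind.
// Unknown kinds fall through to nil, so a new kind gets no tools by default.
func toolsFor(kind PaneTools) []string {
	switch kind {
	case PaneToolsBrowser:
		// "navigate" and "click" are placeholder names.
		return []string{"navigate", "click", "execute_js"}
	case PaneToolsFull:
		return []string{"bash", "file_edit", "file_write", "git"}
	default:
		return nil
	}
}

// allowed reports whether a pane kind may call the named tool at all.
func allowed(kind PaneTools, tool string) bool {
	for _, t := range toolsFor(kind) {
		if t == tool {
			return true
		}
	}
	return false
}

// needsApproval marks the browser tool that the diagram says requires approval.
func needsApproval(kind PaneTools, tool string) bool {
	return kind == PaneToolsBrowser && tool == "execute_js"
}

func main() {
	fmt.Println(allowed(PaneToolsNone, "bash"))             // false
	fmt.Println(allowed(PaneToolsBrowser, "bash"))          // false
	fmt.Println(allowed(PaneToolsFull, "bash"))             // true
	fmt.Println(needsApproval(PaneToolsBrowser, "execute_js")) // true
}
```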