
6. rysh-cli — The Multiplexer & Agent Runtime

rysh-cli (~60K LOC Go) is the product's heart: a terminal multiplexer in which panes can be shells or AI agents, built as a client/daemon application on top of a protoactor-go actor system and an embedded NATS bus. This is the largest sub-project; this chapter maps its packages, the actor hierarchy, the TUI, terminal emulation, the agent toolbox, channels, voice, sharing, and the embedded web UI.

Build note. rysh-cli is intentionally excluded from go.work (it pins its own dependency tree). Build and test it with GOWORK=off. It consumes rysh-shared via a replace in its own go.mod.


6.1 Process model: client, daemon, CLI

graph TB
    subgraph procs["Processes"]
        direction LR
        TUI["TUI client<br/>(Bubble Tea)"]
        Daemon["daemon<br/>(headless backend)"]
        CLIc["cli.Client<br/>(short-lived)"]
    end
    TUI <-->|NATS TCP| Daemon
    CLIc -->|"Send / Request"| Daemon
    Daemon --- NATS["embedded nats-server<br/>+ JetStream KV"]
    Daemon --- Tree["actor tree"]

main.go is a hand-rolled argument router (not cobra). The installed binary is named ry (with rysh as an alternate name; the examples below use rysh). Commands:

| Command | Effect |
|---|---|
| rysh (no args) | auto-attach to an existing session daemon, or spawn one and attach a TUI |
| rysh daemon <name> | (internal) run the headless backend: start NATS + the actor tree; block on SIGTERM/SIGINT |
| rysh attach [--upgrade] / create / detach / stop / list-sessions / delete-session | session lifecycle |
| rysh tab / lane / pane-group / pane / stacked-pane / pipeline <list\|create\|delete\|…> | entity-management subcommands (each with its own sub-verbs) |
| rysh send | send input to a pane |
| rysh version / -V | print version |
| rysh --pane / --cmd / --share / ... | CLI equivalents of in-TUI ## system commands |

The daemon survives terminal close (goHeadless() re-execs detached). Detach is a graceful SIGUSR1 to the attached TUI PIDs. Multiple TUIs can attach to one daemon; their PIDs are tracked in the session record.

Upgrade-on-attach. On attach (and the no-arg path), the client compares the running daemon's recorded binary Version/BinHash against its own; if they differ it can restart the stale daemon with the new binary (maybeUpgradeDaemon → upgradeDaemon), controlled by the --upgrade flag and the configured upgrade-on-attach mode.

bus.Bus (internal/bus/bus.go)

The daemon spine: an embedded nats-server (JetStream on, 127.0.0.1, default client port 24242, ~/.local/state/rysh/nats store dir), a protoactor ActorSystem (created with an io.Discard logger to silence startup noise), a CodecRegistry, and a NATSPublisher. On start it first tries to connect to an existing server on the port and reuse it, else starts its own. The JetStream KV buckets are session-namespaced (bus.go:135):

rysh-panes-{session}      rysh-workspace-{session}
rysh-pipeline-{session}   rysh-agents-{session}      (all FileStorage; empty session → "default")

A fifth KV bucket, rysh-memory (actors/memory.go), backs per-pane memory summarization (see §6.2). Unlike the four above it is not session-namespaced and is not removed by session.store.Delete (which only deletes the four -{session} buckets).
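The naming rule for the four session-scoped buckets can be sketched as follows. This is a minimal illustration, not the code in bus.go: sessionBuckets and kvPrefixes are hypothetical names, and only the bucket names and the empty-session fallback come from the text above.

```go
package main

import "fmt"

// kvPrefixes lists the four session-scoped bucket prefixes named in this section.
var kvPrefixes = []string{"rysh-panes", "rysh-workspace", "rysh-pipeline", "rysh-agents"}

// sessionBuckets is a hypothetical helper that derives the bucket names
// for one session. An empty session name maps to "default".
func sessionBuckets(session string) []string {
	if session == "" {
		session = "default"
	}
	names := make([]string, 0, len(kvPrefixes))
	for _, p := range kvPrefixes {
		names = append(names, p+"-"+session)
	}
	return names
}

func main() {
	fmt.Println(sessionBuckets(""))
	fmt.Println(sessionBuckets("work"))
}
```

rysh-memory is deliberately absent from the list, which mirrors why deleting a session leaves that bucket in place.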

ClientPort() is what TUIs and the cli.Client connect to (0 in external mode). The root actor is WorkspaceFarmActor, spawned named "workspace-farm". The web UI defaults to port 23232.

Session store (internal/session/store.go)

A file-backed registry (one <name>.json per session, mode 0644):

type Record struct {
    Name      string    `json:"name"`
    Path      string    `json:"path"`
    State     string    `json:"state"`     // detached | running | stopped
    PID       int       `json:"pid,omitempty"`
    TUIPIDs   []int     `json:"tui_pids,omitempty"`
    NATSPort  int       `json:"nats_port,omitempty"`
    UpdatedAt time.Time `json:"updated_at"`
    Version   string    `json:"version,omitempty"`  // daemon binary version, for upgrade-on-attach
    BinHash   string    `json:"bin_hash,omitempty"` // daemon binary hash, compared on attach
}

Liveness uses Signal(0); terminatePID sends SIGTERM, polls 10×100ms, then SIGKILL. Delete kills the daemon PID and removes the session's four JetStream KV buckets (both live via js.DeleteKeyValue and on disk under {natsDir}/jetstream/$G/streams/KV_{bucket}) plus working-dir artifacts (.rysh/, .rysh-notes.md). The daemon survives terminal close via goHeadless() (dup /dev/null to stdio + ignore SIGHUP) and SysProcAttr{Setsid: true}; spawnDaemon polls every 50ms (up to 10s) for the daemon to report a NATS port.


6.2 The actor model (internal/actors/)

This is the core. See §2.1 for the hierarchy diagram. Highlights:

Entity actors

| File(s) | Actor | Notes |
|---|---|---|
| workspace_farm.go | WorkspaceFarmActor | session root; in-process control messages (activate/switch/reconcile) that are not on NATS |
| workspace*.go | WorkspaceActor | owns ws.inbox/ws.snapshot; routes input (##/@/@@/####/normal); memoizes full + layout-only snapshots; gates registry spawning to the primary workspace |
| tab*.go | TabActor | owns pipeline state; grid creation |
| lane*.go | LaneActor | column; flex weight; concurrent snapshot fan-out |
| pane_group.go | PaneGroupActor | row slot; rowFlex; stacked-pane rotation |
| pane*.go | PaneActor | PTY + *vterm.VTerm; spawns the per-pane LLMPromptExecutionActor and a child MemoryManagerActor |
| memory.go | MemoryManagerActor | per-pane child; subscribes to pane.{id}.memory.summarize, calls the LLM to summarize older turns, persists MemoryState to the rysh-memory KV bucket |

Input routing (workspace_rysh.go)

graph LR
    In["MsgSubmitInput → ws.inbox"] --> Disp{prefix}
    Disp -->|"##..."| Sys["runRyshCommand<br/>(mirrored to rysh + shell buffers)"]
    Disp -->|"@..."| Agent["agent/humanoid prompt"]
    Disp -->|"@@..."| Ctl["agent/humanoid control"]
    Disp -->|"####..."| Relay["relay to SOURCE pane<br/>of a control-mode share"]
    Disp -->|normal| Pane["MsgPaneSubmitInput → pane.{active}.inbox"]

The normal-input fast path bypasses Tab/Lane/Group entirely (workspace_rysh.go):

if strings.HasPrefix(m.Text, "@@")   { /* humanoid/agent control */ ; return }
if strings.HasPrefix(m.Text, "@")    { /* humanoid/agent prompt   */ ; return }
if strings.HasPrefix(m.Text, "####") { /* relay ##cmd to share source pane */ ; return }
if strings.HasPrefix(m.Text, "##") && !strings.HasPrefix(m.Text, "##>") { w.runRyshCommand(...); return }
// normal:
w.pub.Send(msg.T("pane", activePaneID, "inbox"), &MsgPaneSubmitInput{...})

The ## system commands recognized by runRyshCommand include: tab, pane, lane, panegroup/pg, public/private, history/h, pipe/pipeline, share/unshare, upstream (status/my-shares/list-remote/subscribe/unsubscribe/send), snap, workspace/ws (list/create), new, cmd, rysh (new / web start|stop|status), hop, agent, humanoid, integration/int (list/enable/disable/tools/remove — Rysh Forge, §6.11), mcp (add/list/tools/reconnect/remove — external MCP servers, §6.12).

##> is a separate in-band event channel, not a ## command. It is deliberately excluded from runRyshCommand. ##>event:ai:softdev: / ##>event:sh:softdev: triggers (softdev_events.go) drive auto-advancing build phases: a softdevPrompts map keyed by language:phase injects the next phase's prompt as the pipeline progresses. See internal/pipeline (§6.13).
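The prefix checks are order-sensitive: "@@" must be tested before "@", and "####" before "##". A runnable sketch of the same dispatch, assuming hypothetical route names and a classify helper that do not exist in the source:

```go
package main

import (
	"fmt"
	"strings"
)

type route string

const (
	routeControl route = "control" // "@@..." agent/humanoid control
	routePrompt  route = "prompt"  // "@..."  agent/humanoid prompt
	routeRelay   route = "relay"   // "####..." relay to a share's source pane
	routeSystem  route = "system"  // "##..." system command, but not "##>"
	routeOther   route = "other"   // normal input
)

// classify mirrors the order of the checks in the excerpt above.
// Longer prefixes are tested first so they are not shadowed.
func classify(text string) route {
	switch {
	case strings.HasPrefix(text, "@@"):
		return routeControl
	case strings.HasPrefix(text, "@"):
		return routePrompt
	case strings.HasPrefix(text, "####"):
		return routeRelay
	case strings.HasPrefix(text, "##") && !strings.HasPrefix(text, "##>"):
		return routeSystem
	default:
		// "##>" lines also land here in this sketch, because the excerpt
		// does not show where the event channel is handled.
		return routeOther
	}
}

func main() {
	for _, in := range []string{"@@stop", "@fix the test", "####ls", "##tab new", "##>event:ai:softdev:", "ls -la"} {
		fmt.Printf("%-22q %s\n", in, classify(in))
	}
}
```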

Structural ops (create/close/focus/resize) cascade down the tree; reads (snapshots) cascade down via NATS request/reply with concurrent lane/group fan-out (goroutines + sync.WaitGroup; per-group request timeout 2s at the lane level, 1s at the pane level). State persists to JetStream KV (ToKV/doRestoreFromKV), restored in *actor.Started. Two persistence constants: workspaceKVInterval = 2s (debounced write) and snapshotCacheTTL = 100ms.

The workspace memoizes two snapshot caches — a full snapCache and a content-free snapLayoutCache — both invalidated on persistToKV; the TUI's recurring poll requests LayoutOnly: true.

Agents vs Humanoids

Both are headless LLM actors (no PTY) with their own child LLMPromptExecutionActor, registered to a pane so their streaming output lands in that pane's chat/external buffer.

graph TD
    subgraph agent["AgentActor (agent.go)"]
        A["autonomous coding agent<br/>subject agent.{name}.inbox"]
    end
    subgraph human["HumanoidActor (humanoid.go)"]
        H["agent + external channels<br/>subject humanoid.{name}.inbox"]
        Ch["channel adapters<br/>(Slack/Email/WhatsApp/Phone/Chatbot)"]
        H --- Ch
    end
    AR["AgentRegistryActor"] --> A
    HR["HumanoidRegistryActor"] --> H
    Ch -->|inboundLoop → MsgHumanoidInboundMessage| H
    H -->|routeOutboundToChannel| Ch

  • AgentActor — output routed to the first registered pane's chat buffer. Managed by AgentRegistryActor (KV-persists the agent map, restores on restart).
  • HumanoidActor — each channel adapter runs an inboundLoop goroutine republishing inbound messages; the actor builds a contextual prompt (per-thread history + LLM-summarized memory), runs the LLM, buffers streaming output, and routes the reply back to the originating channel. Output goes to the pane's external buffer. Supports email governance modes: ai (auto-reply) vs human (draft-and-confirm with the email toolbox). Tool approvals in external mode are handled via text replies.

Both registries are spawned only by the primary workspace (their subjects aren't workspace-namespaced).

Declarative skill files (agent_skillfile.go, humanoid_skillfile.go, skillpath.go)

Agents and humanoids can be defined declaratively via YAML-frontmatter SKILL.md files instead of being constructed in code. They are discovered under .rysh/agents/<name>/SKILL.md and .rysh/humanoids/<name>/SKILL.md (project-local ./.rysh preferred, else $HOME/.rysh; see ryshBaseDirs in skillpath.go). The frontmatter supplies name/system-prompt/config; humanoid frontmatter additionally carries a contacts: map for per-channel configuration. Spawning commands:

  • ##agent spawn <name> / ##agent spawn-all [dir]
  • ##humanoid spawn <name> / ##humanoid spawn-all

6.3 The TUI (internal/tui/)

Framework: charmbracelet/bubbletea (Elm-style Model-Update-View) + lipgloss + bubbles/textinput, muesli/termenv (forced TrueColor), charmbracelet/x/ansi for wrapping. The Model talks to the daemon only over NATS — it does not import internal/actors.

The Model-Update-View loop

graph TD
    Init["Init(): tea.Batch<br/>refreshCmd + tickCmd(250ms)<br/>+ listen* (block on NATS-fed channels)"] --> Update
    Update["Update(msg): big type switch"] --> Render
    Render["View(): header / body / footer<br/>(lipgloss.JoinVertical)"] --> Update
    Update -->|WindowSizeMsg| Geom["recompute geometry, resize PTYs"]
    Update -->|MouseMsg| Mouse["handleMouse / forwardRawMouse"]
    Update -->|KeyMsg| Mode["modal state machine"]

  • Init() batches a snapshot poll, a 250ms tick, and long-lived listen* commands that block on Go channels fed by persistent NATS subscriptions (approval, pipeline output, attention, mirror-dirty, content, layout-dirty). Each re-arms after delivering.
  • Update() dispatches window/mouse/key events and snapshot/content updates.
  • View() renders header (workspace + tab strips as Powerline bubbles), body (lanes as columns of pane panels), and footer.

Modal input (tmux-style)

inputMode is a string type (type inputMode string, model.go:90), not an iota int. Its constants (model.go:93–109): modeNormal, modeTab (ctrl+t), modeWorkspace (ctrl+w), modePane (ctrl+p), modeResize, modeNavigate (ctrl+space), modePrefix (ctrl+o), modeAltPPrefix (alt+p fullscreen), modeRenamePane/modeRenameTab, modeStack (ctrl+s), modeMovePane (ctrl+y), modeLayout (ctrl+l), modeApproval/modeRejectReason, modeRaw, modeRawScroll. Per-pane input is dual-mode shell vs prompt (toggled by double-Escape); Tab does readline completion in shell mode.

Init() batches refreshCmd (snapshot poll, ws.snapshot with LayoutOnly:true, 2s timeout), a 250ms tickCmd, and the long-lived listen* commands. The Model forces termenv.TrueColor; the rename input caps at 200 chars, reject-reason at 500.

In modeRaw, every keystroke is converted to raw terminal bytes (keys.go has the full ANSI/CSI table) and forwarded to the PTY — but multiplexer controls (ctrl+o/l/space/p/t/w/s/y, alt+p) are still intercepted.

The content plane (the clever part)

Snapshots carry layout only; per-pane display content is streamed (content_stream.go) and reconciled:

| Mechanism | Trigger / constant | Purpose |
|---|---|---|
| STREAM | wildcard subs to pane.*.output.{shell,ai,chat,rysh,email,slack,chatbot} (email/slack/chatbot → local "external") | accumulate deltas into per-pane buffers |
| BACKFILL | first visibility | full MsgGetPaneSnapshot pull |
| RECONCILE | contentReconcileInterval = 1500ms | re-pull full content to heal dropped deltas |
| RAW VT | rawFetchInterval = 50ms | pull interactive panes' VT screens wholesale |
| post-submit | postSubmitFetchDelay = 80ms | quick pull right after sending input |

Buffers are capped at maxPaneContentBytes = 20000 (matching the actor-side maxPaneBuffer). ai output is mirrored into both output and aiOutput (matching the actor's dual-publish). Loads from a full snapshot replace (not append) the buffer, so reconcile self-corrects. Before render, rehydrateSnapshot() merges these buffers into m.snapshot (skipping mirror-prefixed panes) so View() stays simple. Layout-dirty and pane-status events coalesce on a ~16ms timer.
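The append-versus-replace distinction is what makes reconcile self-correcting. A minimal sketch, assuming the cap keeps the newest bytes (the source states the cap but not which end is trimmed); paneBuffer and its methods are hypothetical names:

```go
package main

import "fmt"

const maxPaneContentBytes = 20000

// paneBuffer is a hypothetical per-pane content buffer.
type paneBuffer struct{ data []byte }

// appendDelta adds a streamed delta and keeps only the newest
// maxPaneContentBytes. Assumption: trimming drops the oldest bytes.
// A byte-level trim can split a multi-byte rune; a fuller version
// would trim on a rune or line boundary.
func (b *paneBuffer) appendDelta(delta []byte) {
	b.data = append(b.data, delta...)
	if over := len(b.data) - maxPaneContentBytes; over > 0 {
		b.data = append([]byte(nil), b.data[over:]...)
	}
}

// loadSnapshot replaces the buffer with a full pull, discarding any
// drift accumulated from dropped or duplicated deltas.
func (b *paneBuffer) loadSnapshot(full []byte) {
	b.data = nil
	b.appendDelta(full)
}

func main() {
	b := &paneBuffer{}
	b.appendDelta(make([]byte, 25000))
	fmt.Println("after oversized delta:", len(b.data))
	b.loadSnapshot([]byte("$ ls\nREADME.md\n"))
	fmt.Println("after snapshot:", len(b.data))
}
```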

Native-speed relay (relay.go)

For fullscreen interactive apps (vim/htop/Claude Code), the TUI uses tea.Exec with a PTYRelay: Bubble Tea releases the terminal, the relay calls term.MakeRaw, enters the real terminal's alt screen (\x1b[?1049h\x1b[2J\x1b[H), subscribes to pane.{id}.relay.data (channel cap 256) and .relay.exit before sending MsgRelayActivate{Cols,Rows}, and runs three goroutines: stdin→pane.{id}.rawinput, relay.data→stdout, and SIGWINCH→MsgPaneResize. The stdin goroutine intercepts 0x0f (Ctrl+O) → ErrRelayEscape and 0x0c (Ctrl+L) → ErrRelayLayout to break out. keyMsgToBytes (keys.go) is the full ANSI/CSI table (arrows with shift/ctrl modifiers CSI 1;2/1;5, F-keys, Ctrl+A–Z → 0x01–0x1a, bracketed paste \x1b[200~…201~); Ctrl+O (0x0f) is reserved as the escape hatch and never forwarded. Daemon side: PaneActor.handleRelayActivate + rawReadLoop.
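Three of the byte-level rules above are easy to show in isolation: the Ctrl+letter mapping, bracketed-paste framing, and the breakout scan on stdin. The helper names below are hypothetical, and forwarding the bytes that precede a breakout byte is an assumption about the relay's behavior.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

var (
	errRelayEscape = errors.New("relay: escape (Ctrl+O)")
	errRelayLayout = errors.New("relay: layout (Ctrl+L)")
)

// ctrlByte maps a letter to its control code: Ctrl+A is 0x01, Ctrl+Z is 0x1a.
func ctrlByte(letter byte) (byte, bool) {
	switch {
	case letter >= 'a' && letter <= 'z':
		return letter - 'a' + 1, true
	case letter >= 'A' && letter <= 'Z':
		return letter - 'A' + 1, true
	}
	return 0, false
}

// bracketedPaste frames pasted text so the application can tell it
// apart from typed input.
func bracketedPaste(text string) []byte {
	return []byte("\x1b[200~" + text + "\x1b[201~")
}

// splitAtBreakout returns the bytes safe to forward and, if a breakout
// byte was seen, the matching sentinel error. The breakout byte itself
// is never forwarded.
func splitAtBreakout(chunk []byte) ([]byte, error) {
	i := bytes.IndexAny(chunk, "\x0f\x0c")
	if i < 0 {
		return chunk, nil
	}
	if chunk[i] == 0x0f {
		return chunk[:i], errRelayEscape
	}
	return chunk[:i], errRelayLayout
}

func main() {
	b, _ := ctrlByte('o')
	fmt.Printf("Ctrl+O = %#02x\n", b)
	fmt.Printf("%q\n", bracketedPaste("hello"))
	fwd, err := splitAtBreakout([]byte("ab\x0fcd"))
	fmt.Printf("%q %v\n", fwd, err)
}
```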


6.4 Terminal emulation (internal/vterm/ + third_party/vt10x/)

  • internal/vterm/vterm.go wraps hinshun/vt10x, which go.mod replaces with the vendored third_party/vt10x/ fork that adds a scrollback ring (SetScrollbackMax, Scrollback(), ScrollbackTail(), ScrollbackEvictedTotal()).
  • Each PaneActor owns a *vterm.VTerm fed by the PTY read loop (pane_shell.go:rawReadLoop); the shell starts via creack/pty.
  • VTerm.Write pre-filters then writes, but returns the original length:
filtered := stripUnsupportedCSI(p)              // drop ESC[ >/</= (Kitty kbd, modifyOtherKeys, term-ID)
filtered = v.filterRedundantAltScreen(filtered) // vt10x XORs ModeAltScreen → drop redundant resets
_, err := v.term.Write(filtered)
v.dirty = true; v.invalidateRenderCache()
v.altScreen = v.term.Mode()&vt10x.ModeAltScreen != 0
return len(p), err

filterRedundantAltScreen tracks the fork's own authoritative inAlt state because vt10x toggles (XORs) the alt-screen bit, so a stray reset-while-not-in-alt would spuriously enter alt mode. scrollbackMaxLines = 2000.
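To make the first pre-filter concrete, here is one way a CSI filter of this kind could work. It is not the source's stripUnsupportedCSI: stripPrivateCSI is a hypothetical name, the sequence grammar (ESC [, then bytes up to a final byte in 0x40–0x7e) follows the usual CSI convention, and dropping a sequence truncated at a chunk boundary is a simplification.

```go
package main

import "fmt"

// stripPrivateCSI removes CSI sequences whose first parameter byte is
// '>', '<' or '=' and passes everything else through unchanged.
func stripPrivateCSI(p []byte) []byte {
	out := make([]byte, 0, len(p))
	for i := 0; i < len(p); {
		if i+2 < len(p) && p[i] == 0x1b && p[i+1] == '[' &&
			(p[i+2] == '>' || p[i+2] == '<' || p[i+2] == '=') {
			// Skip parameter and intermediate bytes up to the final byte.
			j := i + 3
			for j < len(p) && (p[j] < 0x40 || p[j] > 0x7e) {
				j++
			}
			// If the chunk ended mid-sequence, the remainder is dropped;
			// a production filter would carry state into the next Write.
			i = j + 1
			continue
		}
		out = append(out, p[i])
		i++
	}
	return out
}

func main() {
	in := []byte("a\x1b[>1ub\x1b[31mred\x1b[0m")
	fmt.Printf("%q\n", stripPrivateCSI(in))
}
```

Because a filter like this shortens the data, a wrapper that implements io.Writer has to report the original length, as VTerm.Write does; returning the filtered length would look like a short write to the caller.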

  • Rendering: RenderANSI/RenderANSIWithCursor produce per-row ANSI strings with SGR runs (the block cursor is drawn by XOR-ing glyphReverse); memoized, cache-invalidated on write/resize.
  • Interactivity detection (the subtlest code, pane_shell.go:166–222) uses two signals from the VTerm: (1) alt-screen (IsAltScreen), applied immediately in both directions; (2) cursor-hidden (IsCursorHidden, for inline Bubble Tea apps like Claude Code), applied immediately on entry but debounced on exit by interactiveExitDebounce = 3s (~5× Bubble Tea's 530ms blink) to absorb cursor-blink flicker. Non-interactive output is ANSI-stripped, command-echo-suppressed (a 2s-deadline CAS state holding queued command echoes), and shell-prompt-stripped before publishing to pane.{id}.output.shell.

6.5 The agent toolbox (internal/tools/)

46 unique tools register into a ToolRegistry (the framework types ToolRegistry/ToolSpec/ToolOutput are aliased from rysh-shared/tools, but the tool implementations below are the CLI's own in internal/tools/; see §4.5). A base registry is built in internal/agentic/setup.go; per pane it is Clone()d and NATS-dependent tools are added.

graph TD
    Base["base registry<br/>(setup.go)"] -->|Clone| PaneReg["per-pane registry"]
    PaneReg --> AddNATS["+ ask_user, pane_inspect, pane_send,<br/>agents_list, session_history,<br/>context_store, todo"]
    PaneReg --> AddEmail["+ email_* (humanoid human-governed mode)"]
    AddNATS --> LLM["LLMPromptExecutionActor<br/>(maxIterations ≈ 50)"]

Tool catalog

| Group | Tools | Notable |
|---|---|---|
| Filesystem | file_read, file_write✋, file_edit✋, multi_edit✋, apply_patch✋, ls, glob, grep, tree | edits return unified diffs for preview-approval |
| Shell/process | bash✋*, bash_background✋, bash_output, kill_shell✋, monitor✋, process_list, port_check, env_read | bash approval only for dangerous patterns; env_read auto-redacts secrets |
| Git | git_status, git_diff, git_log, git_commit | |
| Code intel | symbol_search | find func/type/const/var declarations |
| Build/test | build✋, test_run✋, lint | structured Go results |
| Web | web_search, web_fetch | CLI-local implementations (internal/tools/); web_search hits Brave directly via cfg.BraveAPIKey |
| Workspace / multi-agent | ask_user, pane_send✋, pane_inspect, agents_list, session_history, clipboard, project_notes | ask_user is the user-interaction tool (publishes a question, blocks 5min) |
| Persistence (JetStream KV) | context_store, context_recall, todo, memory_edit | cross-turn KV / session task list; memory_edit writes durable project memory to RYSH.md |
| Agent control / meta | sub_agent, list_tools, rysh_build_pipeline | sub_agent now spawns a real child agent (intercepted by the orchestrator); its registered Execute is just a defensive fallback |
| Email (humanoid) | email_list, email_read, email_draft, email_send✋, email_attach | draft → attach → send (send deletes the draft) |

(✋ = requires approval; ✋* = conditional). BackgroundSessionManager gives each background process a 256KB ring buffer and records exit codes.

A ToolExecutor's Spec() carries Parameters as a json.RawMessage (a JSON Schema, not a string) that is the tool definition sent to Claude. Examples:

  • bash{command (req), timeout (ms, max 600000), work_dir}. maxOutputBytes = 100KB, maxTimeout = 10min, default 30s. RequiresApproval is dynamic — first the user's BlockedCommands, then the exact dangerous list: "rm -rf /", "git push --force", "git reset --hard", "git clean -f", "> /dev/".
  • file_edit{file_path, old_string, new_string} (all required); always requires approval.
  • grep{pattern (req), path, glob, output_mode ∈ content|files_with_matches|count, context}.
  • ask_user{question (req), options[]}; no approval (it is the interaction).
  • memory_edit{action ∈ read|append|write, content}; persists durable project memory to RYSH.md (capped at 64 KB), which is auto-loaded into the system prompt each session. Only write requires approval.
  • sub_agent{task (req), context, system_prompt, allowed_tools[]}; no approval. The orchestrator (rysh-shared/agentic) intercepts the call before registry dispatch and spawns a real child OrchestratorActor with its own isolated context window, an allowed_tools whitelist (else the full parent tool set), and an optional system_prompt (else DefaultSubAgentSystemPrompt). Only the child's summary returns; nesting is depth-capped at MaxSubAgentDepth = 2. The registered tool's own Execute is an unreachable defensive fallback. See §4.2.
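The dynamic approval check for bash can be sketched as two passes over the command string. The dangerous list is copied from the bullet above; requiresApproval is a hypothetical name, and plain substring matching is an assumption, since the source does not say how patterns are compared.

```go
package main

import (
	"fmt"
	"strings"
)

// dangerousPatterns is the built-in list quoted in this section.
var dangerousPatterns = []string{
	"rm -rf /",
	"git push --force",
	"git reset --hard",
	"git clean -f",
	"> /dev/",
}

// requiresApproval checks the user's blocked commands first, then the
// built-in list. Assumption: both use substring matching.
func requiresApproval(command string, blocked []string) bool {
	for _, b := range blocked {
		if b != "" && strings.Contains(command, b) {
			return true
		}
	}
	for _, p := range dangerousPatterns {
		if strings.Contains(command, p) {
			return true
		}
	}
	return false
}

func main() {
	for _, cmd := range []string{"ls -la", "git push --force origin main", "echo hi > /dev/null"} {
		fmt.Printf("%-32q approval=%v\n", cmd, requiresApproval(cmd, nil))
	}
}
```

Under substring matching the "> /dev/" pattern also flags harmless redirects to /dev/null, and "rm -rf /" matches any absolute path; a stricter matcher would trade those false positives for more parsing.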

The base registry built in agentic/setup.go registers (in order): bash, file_read, file_edit, file_write, glob, grep, web_search, web_fetch, monitor, sub_agent, git_status, git_diff, git_log, git_commit, tree, symbol_search, process_list, port_check, env_read, test_run, lint, build, clipboard, project_notes, memory_edit, rysh_build_pipeline, ls, multi_edit, apply_patch, bash_background, bash_output, kill_shell, then list_tools last. Per-pane (CreateLLMPromptExecutionActor) clones and adds ask_user, pane_inspect, pane_send, agents_list, session_history (+ context_store, context_recall, todo if JetStream is available), re-registering list_tools last; maxIterations defaults to 50 here. Headless agents get the same minus ask_user; humanoid email mode adds the five email_* tools plus the email-governance prompt (loaded from system_email_governance.md).

Externalized agent prompts. Every agentic prompt now lives as markdown under rysh-cli/rysh-cli-agent-prompts/, embedded into the binary at build time via //go:embed (prompts.go). At startup main.go loads them into an agentic.Prompts struct and Setup.ApplyPrompts assembles the effective system prompt: system_default.md → project memory (RYSH.md) → env block (system_env_block.md, with {{cwd}}/{{os}}/{{arch}}/{{date}}/{{git_branch}}/{{git_dirty}}/{{project_type}}/{{tree}} substituted) → system_todo_guidance.md, appending system_email_governance.md only in humanoid email mode. It also pushes system_sub_agent.md and system_compaction_summarize.md (a {{transcript}} template) into rysh-shared's exported DefaultSubAgentSystemPrompt / DefaultCompactionSummarizePrompt vars. Empty/missing files fall back to in-package constants in internal/agentic/prompts.go, so the prompts are build-time artifacts, not runtime files.
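Placeholder substitution of the {{name}} kind can be done with strings.NewReplacer. The sketch below is illustrative only: renderTemplate is a hypothetical helper, and the template text is invented for the example rather than taken from system_env_block.md.

```go
package main

import (
	"fmt"
	"strings"
)

// renderTemplate replaces each {{key}} with its value. Placeholders
// with no entry in vars are left untouched.
func renderTemplate(tmpl string, vars map[string]string) string {
	pairs := make([]string, 0, len(vars)*2)
	for k, v := range vars {
		pairs = append(pairs, "{{"+k+"}}", v)
	}
	return strings.NewReplacer(pairs...).Replace(tmpl)
}

func main() {
	// Invented template text; only the placeholder names come from the doc.
	tmpl := "cwd: {{cwd}}\nos/arch: {{os}}/{{arch}}\nbranch: {{git_branch}}\n"
	out := renderTemplate(tmpl, map[string]string{
		"cwd":        "/home/dev/project",
		"os":         "linux",
		"arch":       "amd64",
		"git_branch": "main",
	})
	fmt.Print(out)
}
```

Map iteration order does not affect the result here because no placeholder is a prefix of another.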


6.6 Provider (internal/provider/)

  • provider.go: New(cfg) switches on ProviderName. For "claude" it picks the ClaudeAPI client when an API key is present, else the claude CLI adapter. Any other (or missing) provider name falls through to a static mock, regardless of whether an API key is set.
  • claude_api.go — direct Anthropic Messages API (single-turn, text-only) for non-agentic completions.
  • agentic_provider.go — aliases rysh-shared/provider; NewClaudeAgenticProvider defaults to claude-opus-4-5, maxTokens 8192. (See §4.4.)
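The selection rule in New(cfg) reduces to a small decision table. The sketch below returns labels instead of real clients; pickProvider and the label strings are hypothetical.

```go
package main

import "fmt"

// pickProvider mirrors the described switch: only "claude" consults
// the API key, and every other name yields the mock.
func pickProvider(providerName, apiKey string) string {
	switch providerName {
	case "claude":
		if apiKey != "" {
			return "claude-api"
		}
		return "claude-cli"
	default:
		return "mock"
	}
}

func main() {
	fmt.Println(pickProvider("claude", "sk-example"))
	fmt.Println(pickProvider("claude", ""))
	fmt.Println(pickProvider("", "sk-example"))
}
```

The last call shows the sharp edge: a configured API key with a missing provider name still produces the mock.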

6.7 External channels (internal/channels/)

The ChannelAdapter interface (Type, Start, Stop, Send, InboundCh, Status, SetReplyMode) is implemented per platform; factory.go picks the adapter by type. Wired into the humanoid system: inbound messages become agent prompts, replies go back out.

graph LR
    Ext["Slack / Email / Chatbot / WhatsApp / Phone"] --> Adapter["ChannelAdapter"]
    Adapter -->|InboundCh| H["HumanoidActor"]
    H --> LLM["LLMPromptExecutionActor<br/>(+ email tools if human-governed)"]
    LLM --> H
    H -->|Send| Adapter
    Adapter --> Ext
    Factory["factory.NewAdapter(type, cfg)"] -.-> Adapter

The interface (adapter.go:19–40): Type(), Start(ctx), Stop(), Send(ctx, OutboundMessage), InboundCh() <-chan InboundMessage, Status(), SetReplyMode(mode) ("messages" | "mentions").

| Adapter | Status | Implementation |
|---|---|---|
| Slack | ✅ full | slack-go Socket Mode (OptionPingInterval(20s)); reconnect backoff initialBackoff = 2s doubling to maxBackoff = 60s; livenessWatchdog forces reconnect after livenessTimeout = 90s of silence (checked every 15s); mention dedup; reply modes (all vs @mentions); in-thread replies (ThreadTS→TS fallback) |
| Email | ✅ full | hand-rolled IMAP-over-TLS with IDLE (idleTimeout = 25min, RFC 2177 ≤29min) + SMTP/STARTTLS; tracks UIDNEXT and Re: threading; also backs the email_* tools |
| Chatbot | ✅ remote | polls rysh-server REST; takeover + reply |
| WhatsApp | ⚠ stub | validates creds; TODOs for Meta Cloud API |
| Phone | ⚠ stub | validates Twilio creds; TODOs for Twilio webhooks |

DraftStore holds Draft{ID ("draft-N"), To, Subject, Body, InReplyTo, Attachments []Attachment, CreatedAt} keyed by a sequential id. Humanoid conversation context keeps maxConversationTurns = 20 and conversationTTL = 24h, summarizing evicted turns (MemorySummary capped at 3000 chars).

DraftStore is a thread-safe in-memory map backing the email draft/attach/send tools. Channel credentials are fetched from rysh-server (decrypted connection secrets) so tokens aren't duplicated locally.


6.8 Voice input (internal/voice/)

graph LR
    Idle["StateIdle"] -->|Start| Rec["StateRecording<br/>(sox/rec/ffmpeg/arecord → 16kHz mono WAV)"]
    Rec -->|Stop| Trans["StateTranscribing"]
    Trans -->|"Deepgram nova-3 / OpenAI whisper-1"| Text["text → TUI prompt"]
    Text --> Idle

Controller (states StateIdle=0, StateRecording, StateTranscribing) is a thread-safe state machine wiring a Recorder (auto-detects rec/sox/ffmpeg/afrecord/arecord, records 16kHz mono WAV, rejects files <1024 bytes as "no audio", stops via SIGINT to flush the WAV header) and a Transcriber (default Deepgram nova-3 at api.deepgram.com/v1/listen, or OpenAI Whisper whisper-1 at api.openai.com/v1/audio/transcriptions). Transcribe always removes the temp file and resets to idle, so the TUI can drive Start/Stop while transcription runs off-thread.


6.9 Collaboration: sharing & mirroring

Rysh shares panes (and tabs/lanes/groups) to a remote upstream NATS workspace hosted by rysh-server.

graph LR
    subgraph source["Source CLI"]
        SR["ShareRegistryActor"] --> US["UpstreamShareActor"]
        US -->|"forward output / layout doc"| Up
    end
    Up["ws.{workspace}.share.{id}.*<br/>(rysh-server NATS)"]
    subgraph sub["Subscriber CLI"]
        RSL["RemoteShareListenerActor<br/>(pane share)"]
        MTL["MirrorTabListenerActor<br/>(tab/lane/group share)"]
        Up --> RSL
        Up --> MTL
        RSL --> OwnerPane["owner pane (own VTerm)"]
        MTL --> MirrorTab["synthetic mirror tab"]
    end

Publishing (source): ShareRegistryActor (spawned per-workspace when upstream is enabled) owns share.registry.inbox and spawns an UpstreamShareActor per share. That actor connects to the remote NATS, forwards tracked panes' output/conversation/raw-VT, and for tab/lane/group shares runs a layout loop (layoutPublishInterval = 600ms idle / layoutInteractiveInterval = 150ms while an interactive program runs) publishing a periodic layout document to ws.{workspace}.share.{shareID}.output.layout. It heartbeats every 30s (shareHeartbeatInterval) so the server can reap stale shares. In control mode it also routes inbound commands to local actors. Per-pane restrictions (disable modes, shell allow/forbid lists) are applied here.

File browse for mobile (fs/* relay). File browsing is now always enabled for an active share: AllowFileBrowse is forced true when restrictions are published, and the .fs responder (ws.{ws}.share.{shareID}.fs) is subscribed whenever a share is connected. As a result, the server's fs.list/fs.read/fs.stat relay (doc 07, §7.8) never times out on a missing responder (which had surfaced as an opaque 503). At share time WorkspaceActor.captureSharedRoot pins a SharedRootFolder (the submitting pane's live cwd) into the share record and propagates it (MsgShareEntity.SharedRootFolder → ShareRegistryActor → UpstreamShareActor.sharedRootFolder); resolveBrowseRoot returns this pinned root first (if it still exists) so every pane of a tab share browses the same folder. Live cwd resolution is now cross-platform: procCwd reads /proc/<pid>/cwd on Linux and shells out to lsof on macOS/BSD (a former runtime.GOOS == "linux" gate made every macOS share fall back to the daemon's home dir). The ##share confirmation reports the browse root and how it was resolved (live / startup dir / daemon launch dir).

Subscribing (mirror): ##upstream subscribe <shareID> [view|control]. A pane share → a RemoteShareListenerActor reconstructs output through its own *vterm.VTerm into a local owner pane. A tab/lane/group share → a mirror tab: a synthetic extra tab fed by a MirrorTabListenerActor, carrying source layout + per-pane scrollback, with live interactive panes streamed via raw VT and reflowed to the subscriber's width. Control-mode structural ops/input relay back to the source. Subscription limits are enforced per-workspace via internal/limits.Checker.


6.10 Embedded web UI (internal/web/)

A Gin + Gorilla WebSocket server mirroring the TUI in a browser, connected to the same NATS bus. Serves embedded Vite assets, runs a broadcast Hub, polls ws.snapshot (200ms) for non-stream clients, and offers an event-driven content plane (?stream=1) with layout-only snapshots + per-pane conversation deltas + interactive-pane VT fast-pull. handleCommand maps WS actions → typed NATS messages (pane/tab CRUD, focus/resize, submit_input, approval_response, raw_key_input, agent/humanoid/share commands).


6.11 Rysh Forge: API spec → governed agent tools (internal/forge/)

Forge is a Stainless-style pipeline that turns a machine-readable API spec into governed, agent-callable tools — and, as side artifacts, SDKs, Markdown docs, and a standalone MCP server. One ingester per source format and one generator per output feed a shared Intermediate Representation (ir.API), so adding a format or target is additive, never an N×M matrix (ir/ir.go:1).

graph LR
    Spec["API spec<br/>OpenAPI 3.x · GraphQL introspection"] -->|ingest.OpenAPI / ingest.GraphQL| IR["ir.API<br/>(operations, schemas, auth)"]
    IR -->|toolpack.Build| TP["ToolPack<br/>(ToolDef per operation)"]
    TP -->|"(*ToolPack).Register + Policy"| EXP{"exposure mode"}
    EXP -->|"≤50 ops (static)"| ST["one ToolExecutor per op"]
    EXP -->|">50 ops (dynamic)"| DY["3 meta-tools:<br/>list / get_schema / invoke"]
    ST & DY --> REG["shared ToolRegistry<br/>(panes Clone() at creation)"]
    REG -->|Call| RT["runtime.HTTPExecutor<br/>auth · retry · jq-lite · redact"] --> API2["upstream REST API"]
    IR -. gen .-> ART["docs · go/ts/py SDK · MCP server"]

Pipeline. ingest.OpenAPI / ingest.GraphQL (ingest/openapi.go:17, ingest/graphql.go:21) normalize a spec into ir.API (ir/ir.go:16); (*Operation).ToolInputSchema flattens path/query/header params + body into one JSON-Schema object (ir/ir.go:124). toolpack.Build renders each operation as a ToolDef (toolpack/toolpack.go:39); (*ToolPack).Register applies the Policy and installs ToolExecutors (toolpack/exposure.go:45). At call time runtime.HTTPExecutor.Call assembles the request, injects auth (apiKey/bearer/basic/oauth2), retries 429/5xx with backoff, and optionally trims the response (runtime/executor.go:74).

Surfacing as agent tools. The Forge Manager registers into the shared ToolRegistry each pane/agent Clone()s (manager.go:57, setup.go:161). Each tool is a standard ToolExecutor, so Forge tools flow through the same orchestrator approval/redaction path as native tools. Register/unregister is dynamic (Manager.enable retracts then re-registers; disable/remove/close unregister via ToolRegistry.Unregister).

Governance (toolpack.Policy, exposure.go:32). The central anti-"tool-explosion" lever is exposure mode: auto switches to dynamic above DefaultDynamicThreshold = 50 ops, exposing just 3 meta-tools (list_endpoints, get_endpoint_schema, invoke_endpoint) for a constant context footprint; otherwise static, one tool per op, capped at DefaultMaxTools = 200. Further controls: Tags/ReadOnly filters, per-op approval for mutating endpoints (or all under ForceApproval), an injected jq_filter arg to trim responses before they enter context, and a Redact hook. Secrets are never persisted — only env-var names are stored (store.go:18).
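The exposure decision can be sketched as a pure function of the requested mode and the operation count. The two constants come from the text; the exposure type, its values, and resolveExposure are hypothetical, and truncating at the cap (rather than returning an error) is an assumption.

```go
package main

import "fmt"

const (
	DefaultDynamicThreshold = 50
	DefaultMaxTools         = 200
	metaToolCount           = 3 // list_endpoints, get_endpoint_schema, invoke_endpoint
)

type exposure string

const (
	exposureAuto    exposure = "auto"
	exposureStatic  exposure = "static"
	exposureDynamic exposure = "dynamic"
)

// resolveExposure returns the effective mode and how many tools would
// be registered for an API with opCount operations.
func resolveExposure(requested exposure, opCount int) (exposure, int) {
	mode := requested
	if mode == exposureAuto {
		if opCount > DefaultDynamicThreshold {
			mode = exposureDynamic
		} else {
			mode = exposureStatic
		}
	}
	if mode == exposureDynamic {
		return exposureDynamic, metaToolCount
	}
	if opCount > DefaultMaxTools {
		opCount = DefaultMaxTools // assumption: excess operations are cut off
	}
	return exposureStatic, opCount
}

func main() {
	for _, n := range []int{12, 50, 51, 800} {
		mode, tools := resolveExposure(exposureAuto, n)
		fmt.Printf("%3d ops -> %-7s %d tools\n", n, mode, tools)
	}
}
```

In auto mode the registered tool count never exceeds 50, because anything larger collapses to the three meta-tools; the 200 cap only matters when static is requested explicitly.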

User-facing surface.

  • CLI rysh forge {add|generate|list|targets} (main.go:247, forge_cmd.go); generator targets rysh-toolpack, docs, mcp-server, go-sdk, ts-sdk, py-sdk.
  • In-session ##integration (alias ##int): list, enable <name>, disable <name>, tools <name>, remove <name> (workspace_integration.go).
  • On disk under <workDir>/.rysh/forge/: integrations.json index + <name>/spec.<ext> + generated artifacts in <name>/gen/<target>/. Enabled integrations are re-registered at startup by BootstrapForge (setup.go, main.go:453).

Liveness caveat. Panes clone the registry at creation, so a live ##integration enable only reaches panes/agents created afterward (startup Bootstrap reaches all).

Implemented vs aspirational. OpenAPI → tool-pack → live HTTP execution is fully implemented and tested. Gaps on halil: gRPC ingest is a stub (returns an error, ingest/grpc.go:17); GraphQL has no live runtime (docs/SDK only — live enable is hard-rejected, manager.go:112); the Redact hook is wired to nil at both call sites so live Forge output isn't redacted; basic-auth/oauth2 work in the runtime but aren't CLI-bindable (only --cred-env); pagination is detected but not emitted; jq-lite is a small path-only subset.


6.12 MCP client (consuming external MCP servers) (internal/mcp/)

internal/mcp/ makes Rysh a Model Context Protocol client: it connects to external MCP servers, discovers their tools, and registers each as a first-class agent tool. Adapted to tools.ToolExecutor (executor.go:15), MCP tools enter the same shared ToolRegistry every pane clones and flow through the existing Anthropic tool-use bridge, approval gate, and audit path unchanged. (Distinct from Forge, which generates a local MCP server from a spec — here Rysh is the consumer.)

flowchart LR
    U["##mcp add / .rysh/mcp.json"] --> M[mcp.Manager]
    M -->|stdio| S["child process<br/>NDJSON JSON-RPC"]
    M -->|http| H["Streamable HTTP<br/>JSON / SSE"]
    M -->|"initialize → tools/list"| REG[("shared ToolRegistry")]
    REG -->|cloned at pane create| P["pane agent"]
    P -->|tools/call| M

Configuration. Each server is described by a ServerDef; the definitions are persisted as a JSON array in .rysh/mcp.json (per-project, git-checkable, atomic writes). Two transports: stdio (spawn a child process speaking newline-delimited JSON-RPC: <cmd> [args] --env K=V) and http (Streamable HTTP: JSON-RPC POSTed to a URL, reply as application/json or SSE; --header K:V). Per-server flags: --approve (gate every call), --prefix, --max-tools (default 200).

Registration (##mcp command, actors/workspace_mcp.go): add <name> {http <url>|stdio <cmd> [args]} (validate → persist → connect → register, 60s), list, tools <name>, reconnect <name>, remove <name>. At daemon startup, main.go:443 → BootstrapMCP (45s budget) loads .rysh/mcp.json and connects all servers concurrently before any pane clones the registry, so persisted servers' tools reach every pane; a live ##mcp add reaches only panes created afterward.

Protocol (client.go, protocol.go). ProtocolVersion "2024-11-05", client rysh/0.1.0. Lifecycle: initialize → notifications/initialized ack → tools/list (cursor-paginated, ≤1000 pages) → tools/call. Per-connect timeout 30s. tools/list_changed triggers re-list+re-register on stdio only (HTTP has no long-lived server→client stream, so HTTP servers don't auto-refresh).
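To make the lifecycle concrete, the sketch below builds the first three client messages as newline-delimited JSON-RPC 2.0 lines, the framing used by the stdio transport. It is an illustration only: rpcMessage, request, notification, and handshake are hypothetical names, and mapping "client rysh/0.1.0" to a clientInfo of name "rysh" and version "0.1.0" is an assumption. The method names and the initialize parameter shape follow the MCP specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcMessage is a hypothetical JSON-RPC 2.0 envelope for this sketch.
type rpcMessage struct {
	JSONRPC string `json:"jsonrpc"`
	ID      *int   `json:"id,omitempty"` // nil for notifications
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

func request(id int, method string, params any) rpcMessage {
	return rpcMessage{JSONRPC: "2.0", ID: &id, Method: method, Params: params}
}

func notification(method string) rpcMessage {
	return rpcMessage{JSONRPC: "2.0", Method: method}
}

// handshake returns the opening client messages, one JSON document per
// line. A real client waits for each response before sending the next.
func handshake() ([]string, error) {
	msgs := []rpcMessage{
		request(1, "initialize", map[string]any{
			"protocolVersion": "2024-11-05",
			"capabilities":    map[string]any{},
			// Assumption: name/version split of "rysh/0.1.0".
			"clientInfo": map[string]any{"name": "rysh", "version": "0.1.0"},
		}),
		notification("notifications/initialized"),
		request(2, "tools/list", nil), // first page: no cursor
	}
	lines := make([]string, 0, len(msgs))
	for _, m := range msgs {
		b, err := json.Marshal(m)
		if err != nil {
			return nil, err
		}
		lines = append(lines, string(b))
	}
	return lines, nil
}

func main() {
	lines, err := handshake()
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Subsequent tools/list pages would carry the cursor returned by the previous response in params, and tools/call follows once discovery completes.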

Namespacing & approval. Registered name = sanitize(<prefix> + remoteName), prefix default "<server>_", coerced to ^[a-zA-Z0-9_-]{1,64}$, collisions suffixed _2…; description tagged [mcp:<server>]. Approval is per-server, all-or-nothing (--approve), enforced upstream in the orchestrator. A tool's isError result surfaces as ToolOutput.Error (model-recoverable); transport failure is a hard error. Validated against the rysh-mcp-samples/ reference servers via integration_samples_test.go.
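The naming rule can be sketched as follows. This is a hedged illustration rather than the package's actual code: sanitize and registeredName are hypothetical names, replacing each invalid character with "_" and the "tool" fallback for an empty result are assumptions, and so is the way over-long names are trimmed to make room for the collision suffix.

```go
package main

import (
	"fmt"
	"regexp"
)

const maxNameLen = 64

// invalidChars matches anything outside the allowed set [a-zA-Z0-9_-].
var invalidChars = regexp.MustCompile(`[^a-zA-Z0-9_-]`)

// sanitize coerces s to match ^[a-zA-Z0-9_-]{1,64}$.
// Assumption: invalid characters become "_"; empty input becomes "tool".
func sanitize(s string) string {
	s = invalidChars.ReplaceAllString(s, "_")
	if s == "" {
		s = "tool"
	}
	if len(s) > maxNameLen {
		s = s[:maxNameLen] // safe: s is pure ASCII after replacement
	}
	return s
}

// registeredName derives the registry name for a remote tool and
// records it in taken. An empty prefix falls back to "<server>_".
func registeredName(server, prefix, remote string, taken map[string]bool) string {
	if prefix == "" {
		prefix = server + "_"
	}
	base := sanitize(prefix + remote)
	name := base
	for i := 2; taken[name]; i++ {
		suffix := fmt.Sprintf("_%d", i)
		trimmed := base
		if len(trimmed)+len(suffix) > maxNameLen {
			trimmed = trimmed[:maxNameLen-len(suffix)]
		}
		name = trimmed + suffix
	}
	taken[name] = true
	return name
}

func main() {
	taken := map[string]bool{}
	fmt.Println(registeredName("fs", "", "read file", taken))
	fmt.Println(registeredName("fs", "", "read/file", taken))
	fmt.Println(registeredName("fs", "x-", "read.file", taken))
}
```

Two remote names that sanitize to the same string (as the first two calls do) end up distinguished only by the numeric suffix, so registration order affects which one gets the unsuffixed name.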

Phase-0 caveats. HTTP delivers no server→client notifications (no auto-refresh); only tools are consumed (MCP resources/prompts and sampling/roots are not); non-text content (image/audio/resource) is summarized to text, not surfaced as media; no auto-reconnect/health loop (manual ##mcp reconnect); --env/--header secrets land in plaintext .rysh/mcp.json.


6.13 Supporting packages

| Package | Role |
| --- | --- |
| internal/cli | cli.Client (short-lived NATS client) + commands.go (typed MsgCLI* to the workspace) |
| internal/upstream | REST helpers to rysh-server (all Authorization: Bearer {apiKey}): FetchWorkspaceID (GET /api/server-info, 4s timeout), FetchConnectionCredentials (GET /api/workspaces/{ws}/connections/by-type/{type}/credentials, 10s), FetchRemoteShares (GET /api/workspaces/{ws}/shares/list) |
| internal/msg | type-alias re-exports of rysh-shared/msg + ~200 CLI-specific codec tags |
| internal/pipeline | global language:phase → prompt registry (written by rysh_build_pipeline, KV-persisted); auto-advancing build phases |
| internal/relay | in-process pane-ID → PTY handle registry for the native-speed relay |
| internal/limits | subscription enforcement (CheckCreate, ServerCheckCreate); no-op when unconfigured |
| internal/logging | slog via tint; disabled by default; debug tees to a temp log file |
| internal/diff | unified-diff computation/rendering for file-edit tool previews |
| internal/domain | shared domain types (messages.go) |
| internal/config | CLI configuration loading (provider keys, upstream, logging, upgrade-on-attach mode) |
| internal/forge | Rysh Forge: ingests OpenAPI/GraphQL specs into a shared IR and generates governed agent tool-packs (live, via ##integration), SDKs, docs, and MCP servers (§6.11) |
| internal/mcp | MCP client: connects to external MCP servers (stdio child / Streamable HTTP), initialize → tools/list → tools/call, registers discovered tools into the shared registry; config in .rysh/mcp.json via ##mcp (§6.12) |

6.14 Constants quick-reference

| Constant | Value | Where |
| --- | --- | --- |
| embedded NATS client port | 24242 | bus.go |
| web UI port | 23232 | workspace_rysh.go |
| pane output buffer cap | 20000 bytes (maxPaneBuffer) | render.go / TUI |
| scrollback rows | 2000 | vterm, pane_output.go |
| background session ring buffer | 256KB | background_session.go |
| bash output cap / max timeout | 100KB / 10min | tools/bash.go |
| TUI tick / reconcile / raw-VT | 250ms / 1500ms / 50ms | model.go, content_stream.go |
| interactive-exit debounce | 3s | pane_shell.go |
| snapshot fan-out timeout | 2s (lane) / 1s (pane) | lane.go, pane_group.go |
| workspace KV debounce / cache TTL | 2s / 100ms | workspace_snapshot.go |
| share layout loop (idle / interactive) | 600ms / 150ms | upstream_share.go |
| per-pane agent maxIterations | 50 | agentic/setup.go |
| conversation turn cap / TTL (humanoid) | 20 / 24h | humanoid.go |
| per-pane memory KV bucket | rysh-memory (not session-namespaced) | actors/memory.go |
| durable project-memory cap (RYSH.md) | 64KB | tools/memory_edit.go |
| sub-agent depth cap / default iters / timeout | 2 / 25 / 15min | rysh-shared/agentic/sub_agent.go |
| MCP connect timeout / max tools / page cap | 30s / 200 / 1000 | internal/mcp/manager.go, client.go |
| MCP stdio line cap / HTTP body cap | 8 MiB / 16 MiB | internal/mcp/transport_*.go |
| Forge dynamic-exposure threshold / max tools | 50 ops / 200 | internal/forge/toolpack/exposure.go |

6.15 Caveats

  • sub_agent spawns a real depth-limited (MaxSubAgentDepth = 2) child agent — the orchestrator intercepts the call, so the registered tool's Execute is an unreachable defensive fallback.
  • WhatsApp and Phone channel adapters are stubs (validation + TODOs); Slack, Email, Chatbot are fully implemented.
  • The provider now supports both streaming (SSE) and non-streaming paths, with in-provider retry/backoff and optional model fallback (implementation in rysh-shared; see §4.4).
  • Rysh Forge gRPC ingest is a stub and GraphQL has no live runtime; Forge response redaction is wired to nil (not active). The MCP client is "Phase 0" (tools only; HTTP transport gets no auto-refresh).
  • The wire protocol is JSON-in-envelope, not protobuf.
  • The orchestrator/manager actors, the bridge, and the publisher/codec are type-aliased imports from rysh-shared — their implementations live outside this sub-project.