AdmiralsBridge is the command center for StreamRift's entire operation. Built on HM0 — the M3 Ultra with 96GB — it is the portal between the Admiral and everything his empire touches. Not just battle-ready, but beautiful. Not just functional, but transferable. This is the seat that will one day be handed to the next in command.
In 2020, StreamRift imagined a framework that could launch 1,000 companies — so fathers could provide for their families and disabled women wouldn't live in a society that treats them as rodents. That vision wasn't a business plan. It was a conviction that building is the most moral act an engineer can perform.
"The profound flex of all engineering isn't just doing the thing, but doing it elegant and stylishly. You have bandwidth for the nice touches. It's not just bending time on what is possible, but it's inspiring just to engage with it."
Not a thousand half-starts. A thousand real, shipped systems. Each one pushes into a domain not yet mastered, solves a problem that requires learning something new, and feeds knowledge back into the portfolio. The portfolio is the product.
Real-time prediction market terminal for high-frequency sports bettors with $50-100k bankrolls. Live at kt.gbuff.com.
Productized infrastructure knowledge into a commercial hosting offering on sovereign American hardware.
Fearless experimentation sandbox where discoveries inform production systems.
Coordinating AI agents at scale across the entire portfolio.
One person with AI as force multiplier. The COO model isn't a gimmick — it's the architecture that makes it possible for one engineer to run a portfolio that would normally require a team. Structure beats intelligence. Session continuity is everything.
NX-03 is a roaming piece of hardware. It's fun, it's mobile, but nothing about it changes the fact that it's not a base station. HM0 — the M3 Ultra with 96GB — is the Admiral's office. This is where we observe and execute: the portal between StreamRift and everything his empire touches.
And at some point, this seat transfers to Brendan. It needs to be so well-architected that the transfer is a key turn, not a knowledge dump.
HM0 is not just the strongest machine — it is the only machine that should hold the unified state of the entire operation. Star topology, SSH transport, JSON state.
fleet-state.json is a flat file. For 11 nodes, this is correct. Revisit at 50+.

```shell
fleet-ctl status                   # Full fleet health snapshot
fleet-ctl ping                     # Quick heartbeat (<2s, parallel)
fleet-ctl dispatch <node> <cmd>    # Run command on remote node
fleet-ctl deploy <project> <node>  # Push project to target
fleet-ctl sync <project>           # Git sync across mirrors
fleet-ctl logs [node] [--tail]     # Aggregated or per-node logs
fleet-ctl bench <node> <model>     # Inference benchmark
fleet-ctl ollama <node> <action>   # Proxy Ollama commands
```
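A minimal sketch of what the flat fleet-state.json might hold and how the boot sequence could read it — the field names here are assumptions, not the actual schema:

```python
import json
from datetime import datetime, timezone

# Hypothetical fleet-state.json shape; real field names may differ.
state = {
    "updated": datetime.now(timezone.utc).isoformat(),
    "nodes": {
        "NX-03": {"ssh": "up", "ollama": "up", "load": 0.42},
        "H0":    {"ssh": "up", "ollama": "down", "load": 0.10},
    },
}

def offline_nodes(state: dict) -> list[str]:
    """Nodes whose SSH heartbeat failed -- the first thing 'run it' reports."""
    return [name for name, n in state["nodes"].items() if n["ssh"] != "up"]

# The heartbeat daemon would overwrite this file every 60 seconds.
with open("fleet-state.json", "w") as f:
    json.dump(state, f, indent=2)

print(offline_nodes(state))  # → []
```

A flat file like this stays trivially diffable and greppable, which is exactly why it is correct at 11 nodes.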
A launchd plist runs fleet-ctl ping every 60 seconds. Checks SSH reachability,
Ollama status, system metrics, and WHM health. Writes to fleet-state.json.
Alerts drop to .hand/mail/ for Forge visibility on next boot.
The boot sequence gains Step 0: read fleet-state.json, report fleet health before project status.
Say "run it" and Forge reports the fleet alongside everything else.
An operational cockpit, not a monitoring dashboard. The single browser tab StreamRift opens when he sits at HM0. Answers "what's happening across my entire operation?" in under 15 seconds.
All machines, health, load, current tasks, inference status.
All projects, status, last activity, lifecycle stage, health.
Active Claude Code sessions across machines. Quick-launch.
Model loading, fleet-wide inference allocation, benchmarks.
Build/deploy pipelines, Smithy status, CI results.
WHM servers, accounts, disk, bandwidth, site health.
Disk usage across fleet, backup status, model storage.
Model benchmarking arena, head-to-head comparisons.
Strategic planning view, decision tracking, roadmaps.
React 18 + Vite + Tailwind + Express + WebSocket. Zero new technology — identical to G1XL and C2. Absorbs existing C2 fleet monitoring and Launchpad project discovery into a unified command surface.
Deep navy-dark (#0a0e14), teal primary accent, monospace data, flat panels
with 1px borders, 150ms transitions. No decoration without function. Information-dense
but not cluttered. The name "Admiral's Bridge" sets the tone.
Two Forges. A federated MCP mesh. Personas decoupled from models. This is how Forge becomes Fleet Admiral — not just one Claude session on one project, but the orchestration layer across the entire fleet.
Portfolio-wide coordinator. Dispatches work across the fleet. Holds fleet context. Lives in AdmiralsBridge repo.
Project-scoped executor. Writes code, runs tests, ships. Lives in any project repo. Reports to Admiral via mail.
mcp-admiral on HM0 is the single MCP tool surface that Claude Code calls.
Admiral routes to satellite HTTP services on NX-03, NX-04, and H0. Satellites operate
independently if Admiral is down. Claude never talks to satellites directly.
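The routing rule above can be sketched in a few lines — hosts, ports, and the tool-call shape are illustrative assumptions, not the real mcp-admiral config:

```python
# Illustrative routing table; satellite hostnames and ports are assumptions.
SATELLITES = {
    "nx-03": "http://nx-03.local:3200",
    "nx-04": "http://nx-04.local:3200",
    "h0":    "http://h0.local:3200",
}

def route(tool_call: dict) -> str:
    """Admiral picks the target; Claude only ever talks to the Admiral."""
    node = tool_call.get("node", "hm0")
    if node == "hm0":
        return "local"                  # handled on HM0 itself
    if node not in SATELLITES:
        raise ValueError(f"unknown node: {node}")
    return SATELLITES[node]             # proxied to the satellite over HTTP

print(route({"node": "h0"}))  # → http://h0.local:3200
```

Because the satellites are plain HTTP services, they keep serving local callers even when the Admiral process is down.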
Gary Jr, Quinn Jr, Oscar — these are stable identities with defined capabilities.
Whether Gary Jr runs gemma3:1b or gemma4:2b is a YAML config swap.
Model upgrades never break dispatch logic. Personas are the contract, models are the implementation.
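The persona/model split can be sketched like this — the config keys are assumptions, shown here as they would look after loading the YAML:

```python
# Persona config as it might appear after yaml.safe_load(); keys are assumptions.
PERSONAS = {
    "gary-jr":  {"role": "fast utility",  "model": "gemma3:1b"},
    "quinn-jr": {"role": "summarization", "model": "gemma3:4b"},
    "oscar":    {"role": "code review",   "model": "qwen2.5-coder:32b"},
}

def model_for(persona: str) -> str:
    """Dispatch logic depends only on the persona name, never the model tag."""
    return PERSONAS[persona]["model"]

# Upgrading Gary Jr is a config edit, not a code change:
PERSONAS["gary-jr"]["model"] = "gemma4:2b"
print(model_for("gary-jr"))  # → gemma4:2b
```

Everything that dispatches work calls `model_for("gary-jr")`; nothing upstream ever hardcodes a model tag.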
| Layer | Content | Loaded When |
|---|---|---|
| L0 | Portfolio summaries (~100 lines) | Admiral boot |
| L1 | Project-specific context | On demand (hail) |
| L2 | Source code, deep analysis | Deep dive |
| L3 | Mechanical work | Delegated, never enters Admiral |
At 100+ projects, category-level Sub-Admirals (Trading Admiral, Infra Admiral, Creative Admiral) sit between Forge Admiral and project-level Roaming instances. Each is a Claude session with category-scoped context.
HM0 audited live. 96GB, Ollama running 22 models, Postgres with live trading data, 14 SSH keys. Critical gaps: no backups, no monitoring, fleet nodes going offline undetected.
| Port | Service | Status |
|---|---|---|
| 11434 | Ollama | Running |
| 5432 | PostgreSQL | Running |
| 3100 | MCP Admiral | Planned |
| 3200 | Fleet Gateway | Planned |
| 5300 | Bridge Dashboard | Planned |
| 6379 | Redis | Planned |
| Allocation | GB | Purpose |
|---|---|---|
| Inference | 60 | Model loading, embeddings, on-demand heavy models |
| Development | 16 | IDE, Claude Code, build tools, browsers |
| OS + System | 12 | macOS, system processes |
| Services | 3 | Postgres, Redis, MCP, fleet services |
| Reserve | 5 | Headroom for spikes |
The context window is where you think; the archive is what you know. Three-tier memory turns "AI that helps with one project" into "genuine second brain for the portfolio."
MEMORY.md, reports, directives, ADRs. Loaded on boot. Updated on wrap. What exists today, made useful.
Session log index (all sessions, all projects), decision registry, pattern library, operator memory. YAML on HM0.
SQLite + nomic-embed-text embeddings + sqlite-vec. Semantic search across all reports, ADRs, patterns. ~400MB idle.
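The core of the recall tier is nearest-neighbor search over stored embeddings. A toy in-memory sketch of the ranking step — the real system uses nomic-embed-text vectors in SQLite via sqlite-vec, not hand-made 3-dim vectors:

```python
import math

# Toy stand-in for the SQLite + sqlite-vec store; embeddings are illustrative.
ARCHIVE = [
    ("ADR-012: chose Postgres for trading data", [0.9, 0.1, 0.0]),
    ("Pattern: retry with jitter on SSH dispatch", [0.1, 0.8, 0.2]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def recall(query_vec, k=1):
    """Rank archived entries by similarity to the query embedding."""
    ranked = sorted(ARCHIVE, key=lambda e: cosine(query_vec, e[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

print(recall([0.95, 0.05, 0.0]))  # → ['ADR-012: chose Postgres for trading data']
```

sqlite-vec does this same ranking inside the database, which is what keeps the 100ms target realistic across thousands of reports.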
A 2-hour session (50K tokens) compresses through four stages:
| Stage | Output | Size |
|---|---|---|
| Session transcript | ~50K tokens | Full conversation |
| Report | ~500 lines | Structured debrief |
| MEMORY.md entry | ~5 lines | Hot state for next boot |
| Session log entry | ~8 lines | Indexed for fleet search |
Compression ratio: ~1000:1. This is how 1,000 projects stay manageable.
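The arithmetic behind the order-of-magnitude claim, with an assumed token count per log line (the doc gives only line counts):

```python
# Rough arithmetic behind the ~1000:1 figure; tokens-per-line is an assumption.
session_tokens = 50_000   # full 2-hour transcript
log_entry_lines = 8       # indexed session log entry
tokens_per_line = 6       # assumed for a terse log line

final_tokens = log_entry_lines * tokens_per_line
ratio = session_tokens / final_tokens
print(f"~{ratio:.0f}:1")  # → ~1042:1
```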
Natural language queries across the fleet's memory. "Why did we choose X for project Y 6 months ago?" returns ranked results in under 100ms. Powered by local embeddings on HM0 — no API calls, no data leaves the machine.
The difference between "here's my password" and "here's a system designed for succession" is that the password gives access, but the system gives competence. The operator becomes a variable in the system, not a hardcoded constant.
.hand/ACTIVE_OPERATOR contains one word. The boot sequence loads the
corresponding profile from .hand/operators/. Switching operators is a
deliberate act, not a guess.
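A sketch of that boot-time read — the profile file extension and error handling are assumptions beyond what the text specifies:

```python
import tempfile
from pathlib import Path

def load_operator(root: Path) -> str:
    """Read .hand/ACTIVE_OPERATOR and confirm a matching profile exists."""
    name = (root / ".hand" / "ACTIVE_OPERATOR").read_text().strip()
    profile = root / ".hand" / "operators" / f"{name}.md"  # extension assumed
    if not profile.exists():
        raise FileNotFoundError(f"no profile for operator {name!r}")
    return name

# Demo against a throwaway directory tree:
root = Path(tempfile.mkdtemp())
(root / ".hand" / "operators").mkdir(parents=True)
(root / ".hand" / "ACTIVE_OPERATOR").write_text("brendan\n")
(root / ".hand" / "operators" / "brendan.md").write_text("# T2 Apprentice profile")
print(load_operator(root))  # → brendan
```

Failing loudly when the profile is missing is the point: switching operators stays a deliberate act.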
| Tier | Name | What They Can Do |
|---|---|---|
| T1 | Observer | Read-only. /sitrep, /scout. Bridge explains everything. |
| T2 | Apprentice | Commit to dev. /forge-mode, /crucible. No production deploy. |
| T3 | Operator | Full project execution. Deploy, ADRs, directives. |
| T4 | Admiral | Full authority. Can modify the Bridge itself. |
Forge adjusts across six axes per operator:
Terse explanations. High autonomy. Light guardrails. Sandwich format. Full portfolio access.
Thorough explanations. Low autonomy. Heavy guardrails. Tutorial format. Current-project only.
Interactive 4-phase onboarding skill. Phase 1: Orientation (interview, tour, comprehension check). Phase 2: Supervised practice. Phase 3: Independent operation. Phase 4: Full transfer. The Bridge teaches itself to the new operator.
Six interconnected systems that take the portfolio from 20 projects to 1,000. File-based until files don't scale. Commands, not UI.
portfolio.json — single source of truth. Every project's name, stage, tech stack, dependencies, health. ~500KB at 1,000 projects.
/new-project — zero-to-running in one command. Template catalog for web apps, CLI tools, API services, infrastructure.
/bridge — 30-second health view. Exception-based at 100+. Category rollup at 1,000.
graph.json — what breaks when something changes. Cross-project dependency tracking.
/learn — cross-project learning. Patterns graduate when adopted across 3+ projects, then bake into scaffold templates.
/focus — manages the scarcest resource. Weekly allocation. Scoring prevents critical dependencies from rotting.
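The /learn graduation rule is simple enough to sketch directly — the data shape and project names here are illustrative:

```python
# Sketch of the /learn rule: a pattern bakes into scaffold templates
# once 3+ projects adopt it. Adoption data shape is illustrative.
ADOPTIONS = {
    "retry-with-jitter": {"kingtrade", "c2", "launchpad"},
    "flat-json-state":   {"admirals-bridge"},
}

GRADUATION_THRESHOLD = 3

def graduated(pattern: str) -> bool:
    return len(ADOPTIONS.get(pattern, set())) >= GRADUATION_THRESHOLD

print(graduated("retry-with-jitter"))  # → True
print(graduated("flat-json-state"))    # → False
```

Counting distinct adopting projects, rather than raw usage, keeps one heavy project from graduating a pattern on its own.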
idea → scaffolding → active → maintenance → stable → archived → sunset
At 1,000 projects, only 5-15 are in active development at any time. The Bridge surfaces the right tooling for each stage. The rest are either running autonomously or waiting.
96GB unified memory unlocks the Specialist+ tier. Models that don't fit on any other machine in the fleet run here. 60-70% of Claude API work can stay local, sovereign, free.
| Model | Params | Memory | Speed | Role |
|---|---|---|---|---|
| qwen2.5-coder:32b | 32B | ~23GB | 25-35 tok/s | Code generation workhorse |
| gemma3:12b | 12B | ~8GB | 50-80 tok/s | Fast utility, summarization |
| mxbai-embed-large | 137M | ~1.2GB | N/A | Embeddings for fleet RAG |
| Model | Params | Memory | Speed | Use Case |
|---|---|---|---|---|
| deepseek-r1:70b | 70B | ~43GB | 12-18 tok/s | Chain-of-thought reasoning |
| llama3.3:70b | 70B | ~40GB | 12-18 tok/s | General intelligence |
| qwen3-coder:30b-a3b | 30B MoE | ~18-20GB | 40-68 tok/s | Fast MoE coding |
| minicpm-v:8b | 8B | ~5-6GB | 30-50 tok/s | Vision / OCR |
| Tier | Models | Speed | Where |
|---|---|---|---|
| Nano | gemma3:1b, phi-3-mini | 89+ tok/s | NX-03, any |
| Worker | gemma3:4-8b, qwen3:1.7-8b | 10-57 tok/s | NX-03, H0 |
| Specialist | codestral, deepseek-coder:33b | 5-15 tok/s | H0, HM0 |
| Specialist+ | qwen2.5-coder:32b, 70B models | 12-35 tok/s | HM0 only |
| Architect | Claude Opus/Sonnet | API | Cloud |
The audit found the Bridge running with shields down. FileVault off. Firewall open. Zero backups. This is the foundation that must be laid before anything else.
Critical Finding: If HM0's SSD fails right now, 14 SSH keys, the streamrift trading database, and 373GB of Ollama models vanish. FileVault is disabled — a stolen Mac means full access to everything.
| # | Layer | Status | Implementation |
|---|---|---|---|
| 1 | Disk Encryption | Not enabled | FileVault, recovery key in safe |
| 2 | Network Hardening | Wide open | Firewall + stealth, SSH key-only, Ollama localhost |
| 3 | Secret Management | No system | age + sops, encrypted YAML, pre-commit hooks |
| 4 | Overlay Network | Flat LAN | Tailscale mesh, per-role ACLs |
| 5 | Backup & DR | Nothing | restic + Backblaze B2 (daily), local weekly |
| 6 | Data Sovereignty | Implicit | Classification (CRITICAL/HIGH/MEDIUM), rules |
| 7 | Audit & Hygiene | Minimal | SSH logging, GPG commits, update schedule |
Runbook at .hand/runbooks/security-hardening-day0.md. Five steps:
FileVault, macOS firewall + stealth mode, SSH key-only auth, fleet pf rules, Ollama
localhost binding. Each step has verification commands and rollback procedures.
Beyond the architecture — what it feels like to sit at the Bridge. The boot-up ritual. The 30-second awareness cascade. The command vocabulary. Engineering as art applied to the meta-tool itself.
Terminal opens. PS1 prompt shows machine identity, branch, fleet count. Tmux layout materializes.
Background fleet health check. SSH pings, Ollama status, alert collection. Cached before you ask.
Data already cached. "run it" produces fleet + project + recommendation instantly. You never wait.
run it, sitrep, forge, crucible, anvil, ship it, hammer time, wrap it, make it so, scout, architect, war room
hail <project> — switch focus. fleet status — full report. battle stations — focus mode. shore leave — graceful shutdown.
1. You will never start cold. The Bridge is always aware before you ask.
2. You will never fight the process. The Smithy handles workflow. You handle creation.
3. You will never outgrow the tool. Twenty projects or a thousand. The limitation is always external.
The gear room. Head down to see how much firepower the fleet has and how to deploy it. 22 models, 373GB on disk, 96GB of unified memory, and the question: how many parallel Forge instances can we run?
| Model | Params | Storage | Gen tok/s | Tier | Verdict |
|---|---|---|---|---|---|
| gemma3:1b | 1B | 0.8GB | 177-191 | Nano | Screaming fast. Nano king. |
| gemma3:4b | 4B | 3.1GB | 100-112 | Nano+ | Best speed:quality ratio |
| gpt-oss:20b | 21B | 12.8GB | 99-102 | Worker* | MXFP4 miracle. Best worker. |
| nemotron-3-nano | 31.6B MoE | 22.6GB | 46-55 | Worker+ | MoE speed, huge knowledge |
| qwen3:8b | 8B | 4.9GB | 47-52 | Worker | Solid, reliable |
| glm-4.7-flash | 29.9B MoE | 17.7GB | 14 | Specialist | Verbose, slow for size |
| qwen2.5-coder:32b | 32.8B | 18.5GB | 9.8-11.5 | Specialist | Best code quality, slow |
| qwen3-coder:30b | 30.5B MoE | 17.3GB | 8.6 | Specialist | Disappointing speed |
The gpt-oss:20b revelation: MXFP4 quantization pushes 21B parameters at 102 tok/s — nearly matching gemma3:4b speed with dramatically better quality. This is the fleet's best worker model.
You do NOT need to load a model N times to serve N agents. Ollama's num_parallel
lets one loaded model serve multiple concurrent requests. The RAM cost for "6 agents using qwen3:8b"
is ~7.9GB (model + 6x KV cache), not 29.4GB. HM0 is running with factory defaults —
that single config change unlocks everything.
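The arithmetic behind that claim, using the figures from the text (the per-request KV-cache size is the rough value implied there):

```python
# Why num_parallel changes everything: weights load once, KV caches per request.
model_gb = 4.9       # qwen3:8b weights, loaded once
kv_cache_gb = 0.5    # per concurrent request (rough figure implied by the text)
agents = 6

shared = model_gb + agents * kv_cache_gb  # one copy, shared by all agents
naive = agents * model_gb                 # six separate loaded copies

print(f"{shared:.1f}GB vs {naive:.1f}GB")  # → 7.9GB vs 29.4GB
```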
gemma3:1b p=8. Maximum parallelism for mechanical tasks. File ops, templates, import updates.
gemma3:1b p=4 + gpt-oss:20b p=2. Fast nano workers + quality mid-tier.
gpt-oss:20b p=2 + qwen3:8b p=2 + gemma3:1b p=2. Recommended default.
qwen2.5-coder:32b p=2 + gemma3:1b p=4. Quality code gen + fast workers.
llama3.3:70b p=1 + gemma3:1b p=1. Maximum reasoning + a runner.
gpt-oss:120b p=1. The 117B behemoth. Full power, single focus.
```shell
launchctl setenv OLLAMA_MAX_LOADED_MODELS 4
launchctl setenv OLLAMA_NUM_PARALLEL 4
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
launchctl setenv OLLAMA_FLASH_ATTENTION 1
# Then restart Ollama
```
Six suites. 24 test prompts. Automated scoring against regex patterns. Speed + quality combined into a single fitness score, weighted by task type.
| Suite | Tests | Quality Weight | Measures |
|---|---|---|---|
| CODE_GEN | 5 | 70% | Write correct code from spec |
| CODE_REVIEW | 4 | 85% | Find real bugs, not noise |
| FILE_OPS | 3 | 60% | Mechanical precision |
| SUMMARIZE | 3 | 75% | Extract facts, don't hallucinate |
| REASONING | 3 | 90% | Step-by-step logic |
| CREATIVE | 3 | 65% | Structured professional prose |
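One way the weighted blend could work — the doc specifies per-suite quality weights but not the exact formula, so the linear blend here is an assumption:

```python
# Assumed fitness formula: linear blend of quality and normalized speed,
# weighted per suite (weights are from the table above).
QUALITY_WEIGHT = {"CODE_GEN": 0.70, "CODE_REVIEW": 0.85, "FILE_OPS": 0.60,
                  "SUMMARIZE": 0.75, "REASONING": 0.90, "CREATIVE": 0.65}

def fitness(suite: str, quality: float, speed_norm: float) -> float:
    """quality and speed_norm are 0-1; the weight shifts per task type."""
    w = QUALITY_WEIGHT[suite]
    return w * quality + (1 - w) * speed_norm

# A fast-but-sloppy model wins FILE_OPS, loses REASONING:
print(round(fitness("FILE_OPS", 0.6, 1.0), 2))   # → 0.76
print(round(fitness("REASONING", 0.6, 1.0), 2))  # → 0.64
```

The same model thus lands in different tiers depending on the suite, which is what the leaderboard should reflect.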
After ollama pull, the auto-benchmark trigger runs all suites immediately.
Historical results stored in JSONL. Leaderboard generated on demand. Fleet-wide comparison
shows same model across different hardware.
From empty repo to full fleet command in 10 weeks. Security first. Then eyes. Then everything else.
P0 — Today: Run the Day 0 security hardening runbook. FileVault + firewall + SSH hardening + Ollama binding. Under 1 hour. Everything else builds on this foundation.
FileVault, firewall, SSH key-only, Ollama localhost, Time Machine. Under 1 hour.
fleet-ctl CLI, node registry, SSH multiplexing, heartbeat daemon, fleet-state.json, /run-it fleet integration.
Remote execution, cross-machine deployment, git sync, post-Anvil fleet sync hook.
Ollama MCP server, fleet-aware routing, inference dispatch, always-on model trio.
Portfolio memory index, session log, local embeddings, SQLite-vec, fleet_recall MCP tool.
React cockpit absorbing C2, fleet + portfolio + sessions panels. The visual Bridge.
| Milestone | Capability | Week |
|---|---|---|
| "run it" shows fleet | Fleet health in every boot | 2 |
| Dispatch from Bridge | Run commands on any machine | 4 |
| Local inference routing | Forge delegates to fleet models | 6 |
| Semantic fleet memory | "Why did we do X?" answered in 100ms | 8 |
| Visual cockpit | Full dashboard in browser | 10 |
| Brendan onboarding | /bridge-training ready | 10 |
Ten ForgeStorm sessions. Six architectural layers. Eighteen documents. One empty repo that is now a blueprint for the seat of an empire.
The Admiral's Bridge is where one engineer + AI punches so far out of his weight class that he surprises even himself. Not just because of what it can do — but because of how it feels to sit in that chair.
The build starts with P0. One hour. Shields up.