Organized AI · MatchBox

MatchBox — strike the match between founders and VCs

A matching layer that sits on top of a partner fund's existing deal flow. Listens to VC-founder calls, scores sentiment + fit, surfaces coaching notes for the founder and ratings for the VC, and turns high-confidence pairings into portable signal. Not another CRM — the engine that makes the next-best founder obvious.

v1 Signal · shipping
v2 Matching · planned
v3 Founder Prep · planned
Forked from ClawBox (Tauri + Bun + React 18)
Forbes-first rollout
Revenue model in negotiation

Product thesis

The best founder–VC matches are obvious in the first 20 minutes of a call. MatchBox makes that signal legible, portable, and actionable — and, critically, durable and confidential. Memory is the cornerstone of the matching problem; the non-obvious connections between funds and founders only surface once a deep corpus exists. Transcripts are PII — so every analyzer call runs against a per-fund fine-tuned local Gemma on claws-mac-mini. Nothing sensitive crosses the Pi-harness boundary. See Runtime Choice.

Four moves per call

listens

Transcribes every VC-founder call and scores sentiment, fit, and red flags. Captures training data for v2 matching.

matches

Pairs founders with the right fund based on call signal plus intake data, not just thesis keywords. (v2 deliverable.)

coaches

Returns founder-ready coaching notes and VC-ready call ratings after each conversation. Closes the feedback loop.

connects

High-confidence deals get shared across a trusted VC network as the subscriber base grows. Network depth compounds.

At-a-glance flow

end-to-end · dual runtime from v1

[Airtable intake] founders sign up ──sync──▶ fund store
VC call ──▶ transcript ──▶ SignalView ──▶ POST /api/signal/calls
        ┌────────────────────┼────────────────────┐
   [OpenClaw]            [Hermes]            [local store]
   analyzer              memory + delivery   FundProfile[]
   one-shot JSON         Honcho · skills     CallSignal
   CallSignal[]          ├──▶ Slack
                         ├──▶ WhatsApp
                         ├──▶ email
                         └──▶ nightly digest
Seed node
Crowley Capital is the seed node for the curated VC network. Partner contact: Jon Forbes.

At a glance

Key — Value
Repo — Organized-AI/MatchBox (private · MIT)
Default branch — claude/founder-fund-matching-urzts
Base project — Forked from ClawBox (still reports clawbox in package.json)
Shipping surface — Signal · src/components/SignalView.tsx
Founder intake — Airtable form
Strategy call — Granola notes
Phase 01 · Strategy

Strategy

Upgrade a working 7/10 VC workflow into the 10/10 industry benchmark by wiring MatchBox into a partner fund's existing process. No forced platform migration.

Forbes-first rollout

Primary · Jon Forbes — Proven software funding pipeline · wants ambiguous-industry expansion
Non-exclusive — Forbes is the case study; same system offered to other funds once proven

Why Forbes first

  • MatchBox slots in as the matching + signal layer on top of existing deal flow — no CRM migration.
  • Forbes's proven pipeline + infrastructure means integration is additive, not invasive.
  • Ambiguous-industry expansion is the explicit thesis he wants — matches MatchBox's core capability.
  • Landing a reference customer early makes the subscription tier easier to price on real outcomes.

Revenue model — in negotiation

Jake meets Jon Forbes this Friday to lock compensation terms before broader rollout. Options on the table:

  • Revenue share on existing portfolio companies uplifted by MatchBox.
  • Deal participation in successful post-implementation investments.
  • Subscription for other funds accessing the MatchBox network.
Open question
  • Terms are not locked yet. All downstream rollout plans depend on the Friday meeting landing.
Phase 02 · Roadmap

Product Phases

Three product versions, compounding value — Signal captures the training data for Matching; Matching's ranking signals feed Founder Prep.

v1 · Signal · shipping

Turns each VC-founder call into structured signal. Transcribe → sentiment + fit + red flags → coaching notes for the founder + ratings for the VC. Captures training data for v2.

v2 · Matching · next

High-confidence founder ↔ fund pairings. Cross-network deal sharing between subscribed funds. Match quality improves fund-over-fund as the data pool grows.

v3 · Founder Prep · later

AI avatar conducts the initial founder interview. Automated screening before the VC meeting. Due-diligence layer that cuts fraudulent claims and wasted partner time.

phases compound · each one produces the training input for the next

[v1 · SIGNAL]              [v2 · MATCHING]             [v3 · FOUNDER PREP]
transcribe                 founder ↔ fund pairings     AI avatar interview
sentiment · fit · flags    cross-network deal share    automated screening
coaching notes             match quality compounds     due-diligence layer

every CallSignal           ranked matches + reasons    founder-side facts
writes to Hermes memory    inform next prep cycle      (avatar-interview output)
        │                                                      │
        └──────────▶  HERMES HONCHO · fact graph  ◀────────────┘
        fund · founder · sector · stage · red-flag patterns · coaching tone
        lives on claws-mac-mini at ~/.hermes/memories/

v1 surface (today)

  • src/components/SignalView.tsx — the operator UI.
  • src/types/signal.ts — FundProfile · CallSignal · SentimentScores · RedFlagHit · AirtableConfigView · AnalyzeCallInput.
  • src/services/signal.ts — fetch client against the local backend at http://127.0.0.1:13000.
  • src/store/signal.ts — Zustand state: funds, calls, Airtable config, loading/syncing/analyzing flags.
  • internal/routes/signal.ts — Hono routes (Airtable config, sync, funds CRUD, analyzer).
  • internal/signal/{index,airtable,analyzer}.ts — store + Airtable sync + LLM analyzer.

What Signal is not

  • Not a recording tool — works off transcripts you already have.
  • Not a standalone SaaS — it's an internal operator surface plus agent delivery into the VC's existing tools.
  • Not yet the matching engine — v1 captures the training signal for v2.
Phase 03 · Signal

Signal — the v1 shipping surface

Signal turns a call into a structured CallSignal: rating, sentiment scores, red flags with quoted evidence, fact-finding prompts, and coaching notes for the VC.

Data model

interface FundProfile {
  id, name, thesis, stage, checkSize, sectors[], redFlags, notes
  source: 'manual' | 'airtable'
  airtableRecordId?, createdAtMs, updatedAtMs
}

interface CallSignal {
  id, fundId, fundName, founderName, callDate
  transcriptExcerpt, rating (0-10)
  sentiment: { enthusiasm, thesisAlignment, concern }   // each 0-10
  redFlags: [{ quote, reason }]
  factFindingPrompts: string[]
  coachingForVc, summary
  model?, createdAtMs
}

interface AirtableConfigView {
  apiKey (masked), hasApiKey, baseId, tableName, viewName
}

Backend routes — internal/routes/signal.ts

Method + path — Role
GET /api/signal/airtable — Current Airtable config (API key masked to last 4)
PUT /api/signal/airtable — Patch config · supports clearApiKey
POST /api/signal/airtable/sync — Pull FundProfile rows from Airtable
GET · POST · PUT · DELETE /api/signal/funds — Manual fund CRUD (coexists with Airtable-sourced rows)
GET · POST · DELETE /api/signal/calls — Call list + analyze + delete
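The analyze endpoint takes a small JSON body. A minimal client-side sketch, assuming the field names listed in the route table; the request builder is separated from `fetch()` so it can be exercised offline, and the `{ call: CallSignal }` response shape is taken from the per-call flow below:

```typescript
// Hypothetical client helper for POST /api/signal/calls.
// Field names mirror the documented route; nothing else is guaranteed.
interface AnalyzeCallInput {
  fundId: string
  founderName: string
  callDate: string   // ISO date
  transcript: string
}

export function buildAnalyzeRequest(input: AnalyzeCallInput) {
  return {
    url: "http://127.0.0.1:13000/api/signal/calls",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(input),
    },
  }
}

// Usage (network call elided):
// const req = buildAnalyzeRequest({ fundId, founderName, callDate, transcript })
// const res = await fetch(req.url, req.init)   // 200 → { call: CallSignal }
```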

Per-call flow

one POST → one CallSignal + two Hermes side-effects

SignalView ──▶ POST /api/signal/calls { fundId, founderName, callDate, transcript }
    internal/routes/signal.ts
        analyzeTranscript(fund, transcript)        internal/signal/analyzer.ts
        callOpenclawDefaultModelChatCompletion()   OpenClaw gateway
        parse JSON · clampScore · parseRedFlags (max 8) · parsePrompts (max 8)
    CallSignal { rating · sentiment · redFlags · factFindingPrompts ·
                 coachingForVc · summary · transcriptExcerpt ≤4000 }
    ├── fire-and-forget Hermes memory write (signal-memory.ts) — Honcho facts keyed by fund · founder · sector
    ├── persist in internal/signal/index — 200 { call: CallSignal }
    ├── fire-and-forget Hermes delivery (delivery/hermes.ts) — Slack · WhatsApp · email · digest
    └── zustand store ──▶ SignalView re-render

Analyzer — internal/signal/analyzer.ts

Builds the prompt from the FundProfile + transcript (truncated to MAX_EXCERPT_CHARS = 20_000), calls callOpenclawDefaultModelChatCompletion, validates the JSON response, and produces an AnalyzeResult that gets wrapped into a persisted CallSignal.

  • Score clamping — each numeric score rounded to 0.1 and bounded to [0, 10].
  • Red flags cap — max 8 per call, each carrying a literal quote + reason.
  • Fact-finding prompts — max 8; the next-call follow-up questions the VC should ask.
  • Transcript truncation is explicit — the prompt appends …[truncated N chars] so the model knows it saw a cut transcript.
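The validation rules above are simple enough to state precisely in code. A sketch of what the helpers could look like — the function names echo the source, but the signatures are illustrative:

```typescript
// Illustrative analyzer validators (signatures are assumptions, not the repo's).

// Round to 0.1 and bound to [0, 10], per the score-clamping rule.
export function clampScore(n: number): number {
  const bounded = Math.min(10, Math.max(0, n))
  return Math.round(bounded * 10) / 10
}

const MAX_RED_FLAGS = 8

// Keep at most 8 red flags; each must carry a literal quote + reason.
export function parseRedFlags(raw: unknown): { quote: string; reason: string }[] {
  if (!Array.isArray(raw)) return []
  return raw
    .filter((f): f is { quote: string; reason: string } =>
      typeof f?.quote === "string" && typeof f?.reason === "string")
    .slice(0, MAX_RED_FLAGS)
}

// Explicit truncation marker so the model knows the transcript was cut.
export function truncateTranscript(t: string, max = 20_000): string {
  if (t.length <= max) return t
  return t.slice(0, max) + `…[truncated ${t.length - max} chars]`
}
```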
Why Airtable
The founder intake form is already on Airtable — Jake runs the top of funnel there. Signal syncs fund profiles from Airtable rather than inventing a new schema. Base id + table name + view are all config.
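Since base id, table name, and view are all config, the sync boils down to one Airtable list-records request. A hedged sketch of the URL construction — the endpoint shape follows Airtable's public REST API, while the mapping from Airtable fields onto FundProfile rows is left as an assumption:

```typescript
// Sketch of the Airtable list-records request used by syncFundsFromAirtable.
// Config fields match AirtableConfigView; field mapping is illustrative.
interface AirtableConfig { apiKey: string; baseId: string; tableName: string; viewName?: string }

export function buildAirtableListUrl(cfg: AirtableConfig): string {
  const base = `https://api.airtable.com/v0/${cfg.baseId}/${encodeURIComponent(cfg.tableName)}`
  return cfg.viewName ? `${base}?view=${encodeURIComponent(cfg.viewName)}` : base
}

// Usage (network call elided):
// const res = await fetch(buildAirtableListUrl(cfg), {
//   headers: { Authorization: `Bearer ${cfg.apiKey}` },
// })
// const { records } = await res.json()
// records[].fields then map onto FundProfile rows keyed by airtableRecordId.
```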
Phase 04 · System

Architecture

MatchBox inherits the ClawBox three-process split: Tauri shell · React 18 SPA · Bun/Hono backend. Signal adds one route module, one store, and one analyzer over that base.

Processes

[Tauri v2 shell · Rust]   src-tauri/ · OS integration · autoupdate
[React 18 + Vite SPA]     src/ · Zustand · Tailwind · en / zh i18n
        │  fetch http://127.0.0.1:13000
[Bun / Hono backend]      internal/ · routes incl. signal.ts
        ├──▶ Airtable fund intake sync
        │  WebSocket RPC
[OpenClaw Gateway]        http://127.0.0.1:18789/v1
[LLM provider]            via callOpenclawDefaultModelChatCompletion
dual runtime · local Gemma for PII-bearing analysis · Hermes for memory + delivery · OpenClaw non-PII only

Bun / Hono backend (127.0.0.1:13000)
    ├──▶ [local Gemma · MLX]   mlx_lm.server :8080 · fund-<id>/ LoRA adapters
    │        PII-sensitive analyzer · structured JSON analyzer output
    │        CallSignal persisted locally
    ├──▶ [Hermes Gateway]      ai.hermes.gateway · claws-mac-mini · memory + delivery
    │        ├──▶ Honcho facts (~/.hermes/memories/) ◀── fire-and-forget write of local analyzer output
    │        ├──▶ Slack · WhatsApp · email · digest
    │        └──▶ metadata-only delivery payloads
    └──▶ [OpenClaw Gateway]    :18789/v1 · non-PII tasks only · title suggestions, settings probes

degradation mode — if Hermes is unreachable, memory + delivery skip silently; the analyzer path and the SignalView UX are unaffected.
PII invariant — no transcript crosses the Pi-harness boundary. Lint-gated in code.

Repo layout

.
├── src/
│   ├── components/
│   │   ├── SignalView.tsx                            # v1 operator UI
│   │   ├── ChatView · OnboardView · SettingsView · … # inherited from ClawBox
│   │   └── chat/ · layout/ · plugins/ · sidebar/ · skills/ · soul/ · ui/
│   ├── services/signal.ts                            # fetch client
│   ├── store/signal.ts                               # zustand
│   └── types/signal.ts                               # shared TS types
├── internal/
│   ├── routes/signal.ts                              # Hono routes
│   ├── signal/
│   │   ├── index.ts                                  # store (funds + calls + airtable cfg)
│   │   ├── airtable.ts                               # syncFundsFromAirtable
│   │   └── analyzer.ts                               # analyzeTranscript via OpenClaw default model
│   ├── providers/openclaw-completions.ts             # model bridge
│   ├── channels · onboard · skills · titles · tool-call-summaries
│   └── routes/{agents,channels,chat,config,cron,models,onboard,sessions,skills,soul,titles,tools}.ts
├── src-tauri/                                        # Tauri v2 Rust shell
├── scripts/                                          # mock-gateway · sync-version · sign-win · …
├── docs/                                             # dependency-policy · openclaw-compatibility · releases · releasing
├── public/ · test/ · index.html
└── package.json                                      # still named "clawbox" · v2026.3.17

Inherited from ClawBox

  • All 12 backend routes (agents · channels · chat · config · cron · models · onboard · sessions · skills · soul · titles · tools).
  • All non-Signal frontend views (Chat · Cron · Onboard · Plugins · Settings · Skills · Soul · Startup · etc.).
  • Onboarding wizard, Gateway-restart banner, language toggle, theme toggle, i18n en/zh.
  • Scripts: mock-gateway.mjs, smoke-backend, sync-version, Windows signing chain, macOS .dmg build.
  • See the ClawBox guide and ClawBox wiki for the base surface.

What's new versus ClawBox

  • SignalView.tsx, signal.ts type/service/store trio, routes/signal.ts, internal/signal/*.
  • MatchBox branding (README, banner.png, logo SVG), founder-intake Airtable hookup.
Phase 05 · Delivery model

Agent, not platform

MatchBox ships as an agent. Each VC gets a custom GPT trained on their own process, reachable where they already work — not as a forced migration to a new app.

Surfaces

Slack

Coaching notes + call ratings posted directly into the fund's working channel.

WhatsApp

Same signal, delivered where the partner already texts with founders.

Email

Follow-up + next-call prompts surfaced alongside the VC's existing thread.

Airtable form

Founder intake — the signup link. Rows sync into MatchBox's fund store.

Delivery fanout

CallSignal → Hermes gateway → where partner funds actually see it

CallSignal
    internal/delivery/hermes.ts
    Hermes gateway (claws-mac-mini) · ai.hermes.gateway · launchd-supervised
    ├──▶ Slack         Socket Mode · @claude_code_slack
    ├──▶ WhatsApp      Business API
    ├──▶ Telegram      bot gateway
    ├──▶ email         SMTP/Gmail MCP
    ├──▶ SMS           provider bridge
    └──▶ cron digest   nightly → Crowley Capital channel
                       Hermes cron scheduler · 0 9 * * * · per-run cost cap · pause/resume
Desktop shell is optional
The Tauri desktop app in this repo is the internal operator surface. Partner funds don't have to adopt it to get value — the agent can run headlessly and reach them through the channels above.

Network effects

  • A curated VC network grows as more funds subscribe.
  • Cross-fund deal sharing for premium members (v2).
  • Crowley Capital seeds network depth.
  • Closes the industry due-diligence gap: fewer fraudulent claims, better fit, less wasted time on poor matches.
Phase 06 · Running from source

Dev

Engine-identical to ClawBox. Same npm ci → npm run dev loop.

Install + run

npm ci
npm run dev         # frontend (Vite :14200) + backend (Bun :13000)
npm run tauri:dev   # desktop shell

OpenClaw for the analyzer

npm install -g openclaw@latest
openclaw gateway run --dev --auth none --bind loopback --port 18789

Build + verify

npm run build:frontend
npm run build:backend
cargo check --manifest-path src-tauri/Cargo.toml

Repo hygiene

npm run scan:repo
npm run audit:licenses
npm run audit:deps
npm run smoke:backend   # against mock-gateway.mjs

Branch note

Default branch today: claude/founder-fund-matching-urzts. No main at repo root yet — signal + matching work land on topic branches.

Phase 07 · Runtime choice · memory-first

OpenClaw + Hermes — dual runtime from v1

Memory is the cornerstone of MatchBox. Matching funds to founders requires connections that only surface after the corpus is deep enough — unforeseen pattern matches across sectors, stages, and partner appetites. That means Hermes can't wait until v3. It lands alongside OpenClaw in v1, as the memory substrate.

The only real coupling

MatchBox calls callOpenclawDefaultModelChatCompletion in internal/providers/openclaw-completions.ts. That's the sole tight link to OpenClaw. Everything else — the Tauri shell, React frontend, Bun backend, SignalView, the Airtable sync, the analyzer's validation layer — is provider-agnostic. A swap is a provider-module rewrite; an addition (what we're doing) is even cheaper.

The strategic bet
Memory-as-cornerstone. A matching engine without durable cross-call memory is just a sentiment scorer. Founder ↔ fund fit depends on patterns that only emerge over time — "fund X likes early fintech but balks at regulatory risk," "founder Y's second call went better than their first," "these three founders all flagged the same market signal." None of that is visible in a single call. All of it is visible if we write to Hermes memory from call one.

Dual-runtime plan (adopted)

OpenClaw · analyzer

Keeps doing what it's best at — one-shot structured-JSON analysis of a transcript. Fast, low-latency, mock-testable, Forbes-demo safe. No changes.

Hermes · memory + delivery

Writes every CallSignal into Honcho/Mem0 keyed by fund + founder + sector + stage. Queries return cross-call pattern matches. Same runtime also handles Slack/WhatsApp/email delivery.

Hermes pros

  • Platform delivery is native. The "agent, not platform" drop into Slack / WhatsApp / email / Telegram is Hermes's core design. MatchBox's stated delivery model maps 1:1.
  • Already running on claws-mac-mini. Reuse the Pi harness — inherit the launchd supervision, Codex OAuth primary, Gemma-4 self-heal fallback, repo-digest tool surface.
  • Entitlement-funded inference. Codex OAuth means marginal cost can trend to zero when a ChatGPT subscription covers it. Gemma fallback keeps the analyzer working when Codex blips.
  • Memory system (Honcho / Mem0). v2 Matching is a learning problem — durable per-partner-fund context across calls comes for free rather than being a custom build.
  • MCP-native. Airtable MCP, Gmail MCP, CRM MCPs plug in without code changes. Lines up with v3 Founder Prep nicely.
  • Skills hub + learning loop. v3's AI-avatar interview maps onto Hermes's skill-extraction pattern cleanly.
  • Cron scheduler built in — scheduled weekly digest emails to partner funds, no custom scheduler needed.

Hermes cons

  • Rewrites ClawBox inheritance. Onboarding wizard, SettingsView, PluginsView, GatewayRestartBanner, internal/compatibility.json all assume OpenClaw. Touches most of them.
  • API shape mismatch. Hermes is an agent runtime — you talk to session loops, not a raw chat-completion endpoint. The analyzer wants one-shot structured JSON, which is exactly OpenClaw's sweet spot. Bridging requires either a thin wrapper around Hermes's model-normalize layer or framing each call as a fresh session.
  • Loses dev ergonomics. scripts/mock-gateway.mjs + smoke:backend only exist because OpenClaw has a defined minimum RPC surface. Hermes has no equivalent mock — smoke tests get more expensive.
  • Support boundary blurs. OpenClaw cleanly splits "client bug vs gateway bug." Swapping means MatchBox inherits Pi-harness operational concerns (self-heal, Codex-OAuth expiry, Gemma fallback).
  • Forbes-rollout risk. Swap delays the first partner-facing demo. Not a technical risk; a delivery risk — Jake's Friday meeting is the gate.
  • Harder self-host for partners. OpenClaw: npm i -g openclaw@latest. Hermes: Python venv + service supervisor. If partner funds ever self-host, OpenClaw is a gentler ask.

OpenClaw pros (reasons to stay)

  • Already working. MatchBox is a ClawBox fork; every wire is already run.
  • Purpose-built for this pattern. Desktop client + portable-vs-system local runtime is OpenClaw's explicit design. Hermes doesn't have that shape.
  • Consistent with the portfolio. ClawBox, OpenClaw, MatchBox, and future OpenClaw-family desktop products share a single backend contract.
  • Defined minimum RPC surface. models.list, sessions.*, chat.*, config.* — testable against a mock, easy to reason about.

OpenClaw cons

  • Delivery channels not as mature. Supported, but not deployed through them at Hermes's scale.
  • No persistent memory out of the box. v2 Matching + v3 Founder Prep both lean on "remember across calls" — more custom work under OpenClaw.
  • No entitlement-funded inference path. OpenClaw bills through whatever provider you configure. Hermes on the Pi harness has the Codex OAuth escape hatch.

Phased recommendation (memory-first)

v1 Signal · dual runtime from day one — OpenClaw analyzer unchanged. Add Hermes memory writes + delivery peer. Memory corpus starts accumulating from call one.
v2 Matching — Matching engine queries the Hermes memory graph. By now the corpus is deep — surfaces unforeseen fund/founder pairings across sectors and stages.
v3 Founder Prep — AI avatar interview uses the same memory graph. Founder side of the map populates, not just the VC side.
Revised heuristic
Start writing to memory before you need it. You can't retroactively populate Honcho with calls that already happened. Every day MatchBox runs without Hermes memory is a day of corpus loss. OpenClaw stays because it works; Hermes joins because memory is the whole product thesis.

PII policy — per-fund local fine-tunes

Transcripts contain named founders, named funds, named partners, financial figures, and confidential thesis signal. None of it leaves the Pi harness. The analyzer runs against a per-fund fine-tuned local Gemma served by MLX (Apple's Metal-accelerated ML framework); the base Gemma-4 4-bit quant serves as the fallback for funds without a fine-tune yet, via mlx_lm.server on :8080.

Why MLX on the Mac mini
claws-mac-mini is Apple silicon (M-series). MLX is native Metal — better memory efficiency than llama.cpp for fine-tuning, first-class LoRA tooling via mlx_lm.lora, single-framework stack for both inference and training. The mlx_lm.server exposes an OpenAI-compatible /v1/chat/completions endpoint so client code (local-gemma.ts) doesn't care that the runtime changed under it.
The hard rule
No transcript ever crosses the Pi-harness boundary. Codex OAuth, the ChatGPT backend, any external inference path — all barred from seeing call content. Metadata (fund id, call id, rating) can travel over the wire for delivery purposes. Transcript bytes cannot.
PII-safe analyzer routing · everything inside the boundary stays local

claws-mac-mini · Tailnet-only
    analyzer(fund, transcript)
        lookup fund model
        ├──▶ fine-tune exists?   mlx_lm.server · ~/.hermes/models/fund-N/ + LoRA
        │                        fine-tuned on that fund's prior CallSignal corpus
        │                        └──▶ structured JSON
        └──▶ no fine-tune yet?   mlx_lm.server · mlx-community/gemma-4-it-4bit (base, :8080)
                                 └──▶ structured JSON
    CallSignal persists locally ──▶ Hermes Honcho (fact writes)

metadata only crosses the boundary ──▶ Hermes delivery · Slack · email · digest
    fund id · call id · rating · summary headline
    NEVER the raw transcript or red-flag quotes

Fine-tune pipeline (MLX)

  1. Corpus. Each fund's CallSignal history (analyzer inputs + outputs) is the training set. Stored locally under ~/.hermes/corpora/fund-<id>/ in JSONL. Never exported.
  2. Base model. mlx-community/gemma-4-it-4bit — Gemma 4 instruction-tuned, 4-bit MLX quantisation. Converted once via mlx_lm.convert and cached at ~/.hermes/models/gemma-base/.
  3. Training. LoRA adapters via mlx_lm.lora --model gemma-base --train --data <fund-jsonl> --adapter-path <fund-dir>. Runs on-host; the M-series GPU handles it in minutes-to-hours depending on corpus size.
  4. Trigger. Automatic when a fund's corpus crosses a call-count threshold (e.g. 25 analysed calls), or manual via the CLI matchbox finetune <fundId>.
  5. Output. Per-fund directory at ~/.hermes/models/fund-<id>/ containing the LoRA adapter + config. mlx_lm.server loads by path; optionally mlx_lm.fuse bakes the adapter into a standalone fund model for faster cold loads.
  6. Rotation. Retrain periodically (monthly cadence to start) as the corpus grows. Prior directory preserved as fund-<id>.v<N>/ for rollback.
  7. Tenancy. One adapter per fund. A fund's adapter never sees another fund's calls.

What the analyzer provider module looks like

// internal/providers/local-gemma.ts (replaces the OpenClaw path for PII-bearing calls)
// backed by mlx_lm.server — OpenAI-compatible, so the client shape is unchanged
export async function analyzeWithFundModel(fundId, messages) {
  const modelAlias = `fund-${fundId}`   // e.g. fund-f123
  const hasFineTune = await adapterExists(modelAlias)
  const model = hasFineTune ? modelAlias : 'gemma-base'
  return fetch('http://127.0.0.1:8080/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, messages, temperature: 0.2 }),
  })
}

// one-time host setup · done during Phase 0 scaffolding
// brew install python@3.11
// pip install mlx-lm
// mlx_lm.convert --hf-path google/gemma-4-it -q --q-bits 4 --mlx-path ~/.hermes/models/gemma-base
// mlx_lm.server --model ~/.hermes/models/gemma-base --port 8080 --adapter-path ~/.hermes/models

Policy invariants (enforced in code)

  • analyzeTranscript may only call providers whose base URL is 127.0.0.1 or a Tailnet-scoped host. Lint-gated.
  • internal/providers/openclaw-completions.ts is reserved for non-PII tasks (title suggestions, settings probes). An explicit allowExternal: false flag gates every call path.
  • Delivery payloads pass through a sanitiser (internal/delivery/sanitiser.ts) that strips transcriptExcerpt and the redFlags[].quote literal text. Only reason and structured scores travel.
  • Memory writes stay on the Pi harness — Honcho is local; no Mem0 cloud ingestion for fund/founder facts.
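The sanitiser invariant above can be captured in a few lines. A minimal sketch, assuming a pared-down CallSignal shape — the real `internal/delivery/sanitiser.ts` will carry more fields, but the rule is the same: scores and reasons travel, transcript text and quote literals never do:

```typescript
// Illustrative sanitiser: delivery payloads keep structured scores and
// red-flag reasons; transcriptExcerpt and quote literals are stripped.
interface RedFlag { quote: string; reason: string }
interface CallSignalLike {
  id: string; fundId: string; rating: number; summary: string
  transcriptExcerpt?: string
  redFlags: RedFlag[]
}

export function sanitiseForDelivery(signal: CallSignalLike) {
  const { transcriptExcerpt: _dropped, ...rest } = signal
  return {
    ...rest,
    // reason travels; the literal quote never leaves the host
    redFlags: signal.redFlags.map(({ reason }) => ({ reason })),
  }
}
```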

Memory write + query

how a CallSignal becomes a matchable fact graph · write path + read path

WRITE PATH — every analysed call
    CallSignal (id=c789, fundId=f123, founderName="Alice Kim", rating=7.8)
    signal-memory.write(signal)          internal/memory/signal-memory.ts
    structured facts ──▶ Hermes Honcho
        fund:f123/thesisAlignment/sector:fintech/stage:seed = 8.2
        fund:f123/coachingTone = "pragmatic-over-visionary"
        founder:"Alice Kim"/redFlagPattern/regulatory = "we'll ignore SOC2 until Series A"
        founder:"Alice Kim"/enthusiasm = 9.1
        call:c789/summary · /rating=7.8 · /model=gpt-4 · /callDate=2026-04-22
    persisted in ~/.hermes/memories/ on claws-mac-mini

READ PATH — later, when v2 matching (or a v1 suggestion panel) asks
    GET /api/signal/memory/suggestions?founderName=Alice Kim
    Hermes Honcho query — "which funds:
        • have thesisAlignment[sector:fintech][stage:seed] > 7
        • AND coachingTone compatible with founder's enthusiasm style
        • AND do NOT flag regulatory risk as blocker
        • AND haven't seen Alice in the last 30 days"
    ranked match candidates [{ fund, score, reason-chain }, …]
    SignalView panel — "What else does this fund care about?" · visible in v1 before v2 formally ships

Integration plan — concrete file map

Phase 0 · scaffolding (days)

  • internal/providers/hermes-client.ts — thin client to the Hermes gateway on claws-mac-mini. Mirrors the shape of openclaw-completions.ts but targets ai.hermes.gateway's HTTP surface. Bearer-auth via a token stored alongside the existing gateway secrets.
  • internal/memory/signal-memory.ts — the memory write layer. Every completed CallSignal emits a set of structured facts to Hermes Honcho, keyed by fund id, founder name, sector, stage, and call date.
  • internal/delivery/hermes.ts — routes a CallSignal into Slack / WhatsApp / email via Hermes platform adapters. Reuses the same client.
  • Settings: new fields for Hermes gateway URL + per-fund delivery channel id. Backwards-compat — if unset, delivery is skipped and memory writes are a no-op (preserves the current OpenClaw-only flow for tests).
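The backwards-compat gate in the last bullet is worth pinning down: unset config must mean "do nothing", not "fail". A sketch under assumed names (`planMemoryWrite`, `HermesConfig` are illustrative, not the repo's):

```typescript
// Hypothetical config gate for hermes-client.ts: when no gateway URL is
// configured, memory writes and delivery become explicit no-ops, which
// preserves the OpenClaw-only flow for tests.
interface HermesConfig { gatewayUrl?: string; token?: string }

type WriteResult = { status: "skipped" } | { status: "sent"; url: string }

export function planMemoryWrite(cfg: HermesConfig, facts: Record<string, unknown>): WriteResult {
  if (!cfg.gatewayUrl) return { status: "skipped" }   // unset → no-op
  // A real client would fire-and-forget here:
  //   fetch(url, { method: "POST",
  //     headers: { Authorization: `Bearer ${cfg.token}` },
  //     body: JSON.stringify(facts) }).catch(logOnly)
  return { status: "sent", url: `${cfg.gatewayUrl}/memories` }
}
```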

Phase 1 · memory writes during v1 (weeks)

  • Hook into POST /api/signal/calls. After the analyzer returns a CallSignal, fire-and-forget a memory write with facts like:
    • fund:<id>/thesisAlignment/sector:<s>/stage:<t> = N
    • founder:<name>/redFlagPattern/<pattern> = quote
    • fund:<id>/coachingTone/<style>
    • call:<id>/summary · /rating · /model · /callDate
  • Write errors must not block the user — Hermes is additive, not on the critical path for v1 demo.
  • Log memory-write failures to internal/logger; expose a health indicator in SettingsView so operators know when the memory substrate is stale.
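The fact shapes in the bullets above can be generated mechanically from a CallSignal. A sketch of the key construction — the input fields and key layout follow the examples listed, but the helper itself is hypothetical:

```typescript
// Illustrative fact builder for signal-memory.ts: one CallSignal in,
// a flat map of Honcho-style keys out. Key shapes follow the bullets above.
interface SignalFacts {
  fundId: string; founderName: string; callId: string
  sector: string; stage: string
  thesisAlignment: number; rating: number
}

export function buildFacts(s: SignalFacts): Record<string, string | number> {
  return {
    [`fund:${s.fundId}/thesisAlignment/sector:${s.sector}/stage:${s.stage}`]: s.thesisAlignment,
    [`call:${s.callId}/rating`]: s.rating,
    [`founder:${s.founderName}/lastCall`]: s.callId,
  }
}
```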

Phase 2 · reads + UI surfacing (follow-up)

  • GET /api/signal/memory/suggestions?fundId=&founderName= — queries Hermes Honcho for cross-call patterns and returns ranked match candidates.
  • New panel in SignalView: "What else does this fund care about?" powered by memory queries — visible insight even before v2 matching ships.
  • Nightly cron on the Pi harness runs a memory-graph digest that gets posted into the Crowley Capital Slack channel.

Phase 3 · promotion to primary runtime (v3)

  • v3 Founder Prep lands. AI avatar interview writes founder-side facts into the same graph.
  • At this point a full OpenClaw → Hermes analyzer swap is cheap (the memory graph is already populated, the client module is already the same shape). Decide then whether to keep the hybrid or consolidate.

What this preserves

  • Forbes-demo safety. OpenClaw analyzer path is unchanged. If Hermes is unavailable, MatchBox degrades gracefully to the current behaviour.
  • Mock-gateway dev loop. Still works. Hermes writes are no-ops when the URL isn't configured.
  • ClawBox inheritance. Onboarding wizard, SettingsView, PluginsView all stay wired to OpenClaw. Hermes config lives in an additive settings section.
  • Support boundary clarity. OpenClaw bugs = client path. Hermes bugs = memory path. Keep them split during the dual-runtime phase.

Risks to watch

  • Memory schema drift. What you write in v1 is what v2 matching has to read. Lock a minimum schema before the first production call. Version it.
  • PII + confidentiality. Hermes memory now holds partner-fund + founder data. The Pi harness is on a Tailnet, but the memory layer is a new data surface — audit the storage + access paths.
  • Hermes gateway availability. Memory writes are fire-and-forget, but a silent back-pressure problem could cost you training data. Monitor the Hermes errors log as part of MatchBox ops.
  • Cost of cross-runtime calls. Analyzer stays on OpenClaw; memory on Hermes. Two gateway auth paths, two config sections, two failure modes. Document them explicitly in SettingsView.
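The schema-drift risk above has a cheap mitigation: stamp every fact with a schema version from the first production call. A sketch with illustrative field names — not the committed schema, just the versioning pattern:

```typescript
// Hypothetical versioned-fact wrapper: v2 readers can detect and migrate
// (or refuse) facts written under an unknown schema instead of misreading them.
const SCHEMA_VERSION = 1

interface VersionedFact {
  schemaVersion: number
  key: string
  value: string | number
  writtenAtMs: number
}

export function stampFact(key: string, value: string | number, now = Date.now()): VersionedFact {
  return { schemaVersion: SCHEMA_VERSION, key, value, writtenAtMs: now }
}

export function canRead(fact: VersionedFact): boolean {
  return fact.schemaVersion <= SCHEMA_VERSION
}
```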
Phase 08 · Hill-climb loops

AutoAgent × AutoResearch — two loops, both local

MatchBox uses two hill-climb loops that never leave claws-mac-mini. AutoAgent evolves the MatchBox harness itself — prompt, validator, tool chain — against a held-out benchmark of real calls. AutoResearch evolves each fund's local fine-tune as its corpus grows. Both honour the PII invariant: no transcript crosses the Pi-harness boundary.

The split in one line
AutoAgent mutates what runs (internal/signal/analyzer.ts + prompt + validators). AutoResearch mutates what gets run against (the per-fund Gemma fine-tune). Same closed loop (ρ · σ · ι · ε · κ), different target, different score function.

Composition — how the two loops compound

the two loops feed each other · everything stays on claws-mac-mini

AUTOAGENT — harness engineering loop
    mutates analyzer.ts · prompt · validators
    ──▶ better CallSignal quality
    ──▶ richer Honcho fact graph (fund · founder · sector · stage)
    ──▶ deeper per-fund corpus at ~/.hermes/corpora/fund-<id>/
AUTORESEARCH — per-fund fine-tune loop
    mutates LoRA config · prompt template · hyperparameters
    ──▶ retrained fund-<id>/ (MLX LoRA adapter)
    ──▶ even better CallSignals for that fund
    └──▶ back to AutoAgent (next round)

Loop A · AutoAgent for the MatchBox harness

The MatchBox analyzer itself is an agent harness. It has a prompt, a tool surface, a validator chain. AutoAgent treats it as the mutation target and hill-climbs against a benchmark of held-out calls.

AutoAgent loop · mutate the harness, score against held-out calls

program.md (human-authored directive)
    "MatchBox analyzer must score F1 ≥ 0.85 on held-out call set.
     No external inference. Red-flag quotes must be literal transcript excerpts."
Meta-agent (Claude Code · Codex · local — configured with allowExternal=false)
    ├──▶ reads internal/signal/analyzer.ts
    ├──▶ reads tasks/ (held-out transcripts + expected CallSignals)
    └──▶ reads results.tsv (prior rounds)
ρ · Reflect  — diagnose failure clusters in last round's trajectories
σ · Select   — propose prompt / validator / tool edits
ι · Improve  — apply edits to analyzer.ts (candidate)
ε · Evaluate — run candidate against tasks/ via LOCAL Gemma
    score = mean F1 over (sentiment · redflag-precision · coaching-quality · fit)
κ · Commit   — keep if score improved, else revert
    append { round, score, patch, trajectory-diff } to results.tsv
└──▶ loop until convergence or budget exhausted

AutoAgent setup in the MatchBox repo

  • program.md at repo root — the only file a human edits regularly. Directive + constraints (no external inference, preserve JSON shape, keep transcript-excerpt truncation).
  • tasks/ — held-out Harbor-format tasks. Each task: a transcript + an expected CallSignal shape (human-annotated rating, sentiment, red-flag categories).
  • results.tsv — append-only round log (score · patch · kept/discarded). Gitignored.
  • Harbor runner points at the local Gemma via --agent-import-path matchbox-analyzer:analyze — never reaches out to an external LLM.
  • Meta-agent = Claude Code or Codex running locally, configured with allowExternal=false so it can't leak transcript fragments into an external provider's context.
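The append-only round log above can be sketched as one tab-separated row per round. Column names follow the guide's description (round · score · patch · kept/discarded); the actual schema in the repo may differ.

```typescript
// Hypothetical results.tsv writer: one TSV row appended per AutoAgent round.
import { appendFileSync } from "node:fs";

interface RoundLog {
  round: number;
  score: number;
  patchSha: string;
  kept: boolean;
}

function formatRow(log: RoundLog): string {
  return [
    log.round,
    log.score.toFixed(4),
    log.patchSha,
    log.kept ? "kept" : "discarded",
  ].join("\t") + "\n";
}

function logRound(path: string, log: RoundLog): void {
  appendFileSync(path, formatRow(log)); // append-only, matching the gitignored results.tsv
}
```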

Loop B · AutoResearch for per-fund fine-tunes (MLX)

Each fund's LoRA adapter at ~/.hermes/models/fund-<id>/ is a living artefact. As the fund's corpus grows, the adapter gets retrained via mlx_lm.lora. AutoResearch mutates the fine-tune config (LoRA rank, target-modules, epochs, LR schedule, prompt template) via JSON Patch against a fund-scoped config, and hill-climbs against the fund's held-out eval suite.

AutoResearch loop · mutate fund-<id>/ adapter via mlx_lm.lora, score against the fund's own held-out corpus

signal source: every new analysed call ──▶ ~/.hermes/corpora/fund-<id>/
significance check: did this fund's accuracy drift below threshold? yes ──▶

ρ · Reflect  — inspect recent prediction errors vs ground-truth CallSignals
σ · Select   — propose fine-tune mutations (RFC 6902 JSON Patch)
               ops: replace /lora_rank · add /prompt-template/section · replace /epochs · …
ι · Improve  — apply patch to fund-<id>/finetune.config.json
               mlx_lm.lora --model gemma-base --train --data <fund-jsonl> --adapter-path <candidate>
               candidate adapter at fund-<id>-candidate/
ε · Evaluate — run candidate against the fund's held-out calls
               score = fund-scoped F1 + coaching-tone-match + red-flag precision
κ · Commit   — promote candidate to fund-<id>/ if better
               archive old as fund-<id>.v<N>/ for rollback
               optional: mlx_lm.fuse to bake adapter into a standalone fused model

version store: local git at ~/.hermes/versions/fund-<id>/ + SQLite manifest at ~/.hermes/versions/manifest.db
  └──▶ append patch + score + model hash; never leaves the host
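The ι step's config mutation is plain RFC 6902 application. A minimal sketch, covering only the "replace" and "add" ops the loop emits; a real implementation would use a full JSON Patch library, and the config field names here are illustrative.

```typescript
// Minimal JSON Patch applier for flat and nested object paths.
type PatchOp =
  | { op: "replace"; path: string; value: unknown }
  | { op: "add"; path: string; value: unknown };

function applyPatch(config: Record<string, any>, patch: PatchOp[]): Record<string, any> {
  const next = structuredClone(config); // never mutate the committed config
  for (const { path, value } of patch) {
    const keys = path.split("/").slice(1); // "/lora_rank" -> ["lora_rank"]
    let node = next;
    for (const k of keys.slice(0, -1)) node = node[k];
    node[keys[keys.length - 1]] = value;
  }
  return next;
}

// e.g. a σ-step proposal bumping LoRA rank and epochs before retraining
const candidate = applyPatch(
  { lora_rank: 8, epochs: 2 },
  [
    { op: "replace", path: "/lora_rank", value: 16 },
    { op: "replace", path: "/epochs", value: 3 },
  ],
);
```

The resulting candidate config is what gets fed to mlx_lm.lora for the retrain.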

Why local git + SQLite instead of Cloudflare KV

The sibling GTM-autoresearch deployment uses Cloudflare KV as the version store. For MatchBox, the version store would contain fund-scoped behaviour signal derived from transcripts — still PII-adjacent. So the drift history lives on-host:

  • Local git repo at ~/.hermes/versions/fund-<id>/ — one repo per fund, one commit per round. Each commit contains the applied JSON Patch plus the candidate config. Replayable.
  • SQLite manifest at ~/.hermes/versions/manifest.db — round log (round · fund · score · patch-sha · model-sha · kept/discarded). Queryable without checking out 47 fund repos.
  • No Cloudflare KV for fund data. KV remains available for non-PII surfaces (the hub index, the public guide deploys) but fund-scoped training signal does not touch it.
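The manifest's round log described above could look like the following schema. This is an assumed shape, shown as plain SQL strings to stay runtime-agnostic; the real manifest.db columns may differ.

```typescript
// Hypothetical manifest.db schema: one row per (round, fund) attempt.
const MANIFEST_DDL = `
CREATE TABLE IF NOT EXISTS rounds (
  round     INTEGER NOT NULL,
  fund      TEXT    NOT NULL,
  score     REAL    NOT NULL,
  patch_sha TEXT    NOT NULL,
  model_sha TEXT    NOT NULL,
  kept      INTEGER NOT NULL,  -- 1 = promoted, 0 = discarded
  PRIMARY KEY (round, fund)
);`;

// Queryable without checking out the per-fund git repos, e.g. best kept score per fund:
const BEST_PER_FUND = `
SELECT fund, MAX(score) AS best_score
FROM rounds
WHERE kept = 1
GROUP BY fund;`;
```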

Meta-agent sandboxing

  • Claude Code or Codex runs locally against the MatchBox repo. OpenClaw's allowExternal gate is set to false on every meta-agent session.
  • The tasks/ directory is gitignored. Held-out transcripts never reach the remote. If a contributor wants to share a benchmark task, they share a synthetic transcript that carries no real PII.
  • Trajectory logs are sanitised before they land in results.tsv. The sanitiser strips any literal transcript excerpt the meta-agent pasted into its reflection notes.
  • Fine-tune training runs on-host via mlx_lm.lora on the Mac mini's Metal GPU; no cloud training pipeline. Candidate evaluation and scoring also run via mlx_lm.server — the framework handles both inference and training in one Python environment.
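The sanitiser's rules are not spelled out in this guide, but its shape might look like the sketch below. The assumption that excerpts appear as long double-quoted spans is purely illustrative.

```typescript
// Hypothetical trajectory sanitiser: redact long quoted spans (assumed to be
// pasted transcript excerpts) before a reflection note lands in results.tsv.
// Short quoted tokens like labels are left alone.
function sanitizeReflection(note: string, minQuotedWords = 5): string {
  return note.replace(/"([^"]+)"/g, (match, inner: string) =>
    inner.trim().split(/\s+/).length >= minQuotedWords ? '"[excerpt redacted]"' : match,
  );
}
```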

Score functions

AutoAgent (harness)
  Score — Mean F1 over sentiment dimensions + red-flag precision + coaching-quality + fit-accuracy · evaluated on the shared held-out task set
  Threshold — Ship when F1 ≥ 0.85 on the held-out set, with no regression > 0.03 on any dimension

AutoResearch (fund fine-tune)
  Score — Fund-scoped F1 · coaching-tone match · red-flag precision against the fund's own held-out corpus
  Threshold — Promote candidate when ≥ current by ≥ 0.02, with fund-specific partner sign-off
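The two commit gates reduce to simple predicates. Thresholds come from the guide; the function names are assumptions, and partner sign-off stays a human step outside the code.

```typescript
type DimScores = Record<string, number>;

// AutoAgent gate: ship when mean F1 across dimensions is at least 0.85 and
// no single dimension regressed by more than 0.03 versus the previous round.
function shipHarness(current: DimScores, previous: DimScores): boolean {
  const dims = Object.keys(current);
  const mean = dims.reduce((s, d) => s + current[d], 0) / dims.length;
  const noRegression = dims.every((d) => previous[d] - current[d] <= 0.03);
  return mean >= 0.85 && noRegression;
}

// AutoResearch gate: promote when the candidate beats the current adapter
// by at least 0.02 on the fund-scoped score.
function promoteAdapter(candidateScore: number, currentScore: number): boolean {
  return candidateScore - currentScore >= 0.02;
}
```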

Invariants across both loops

  • No transcript ever reaches an external LLM — score computation runs against the local Gemma only.
  • Per-fund tenancy — AutoResearch writes per fund; no fund's config sees another fund's data.
  • Reverse-patchable mutations — every AutoResearch patch has a canonical reverse. AutoAgent relies on git revert for rollback.
  • Score-gated commit — κ only promotes when the objective improves and safety invariants hold.
  • Evidence lives on-host — results.tsv (AutoAgent), manifest.db + per-fund git repos (AutoResearch). Both exportable as bundles if partners audit.
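The reverse-patchability invariant can be made concrete: given the forward patch and the pre-mutation config, each op's canonical reverse is mechanical. A sketch covering only the ops the loop emits; names are illustrative and the flat-path assumption is mine.

```typescript
type Op =
  | { op: "replace"; path: string; value: unknown }
  | { op: "add"; path: string; value: unknown }
  | { op: "remove"; path: string };

// A "replace" reverses to a replace carrying the prior value; an "add"
// reverses to a "remove". Reversed ops are applied in the opposite order.
function reversePatch(forward: Op[], before: Record<string, unknown>): Op[] {
  return forward
    .map((op): Op => {
      if (op.op === "replace") {
        const key = op.path.slice(1); // assumes flat paths like "/lora_rank"
        return { op: "replace", path: op.path, value: before[key] };
      }
      if (op.op === "add") return { op: "remove", path: op.path };
      throw new Error(`no canonical reverse recorded for ${op.op}`);
    })
    .reverse();
}
```

Storing this reverse alongside each commit is what makes every AutoResearch round replayable and rollbackable.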
Why both loops need to exist

AutoAgent alone improves the harness but leaves each fund's base Gemma undifferentiated — matching quality caps out at what the shared prompt can extract. AutoResearch alone improves per-fund models but can't fix a structurally bad analyzer. Together they give you a harness that extracts progressively sharper signal and a per-fund model that learns the fund's specific voice, so each round of each loop makes the other loop's next round easier.

What lands when

v1 Signal
  AutoAgent scaffolding in the repo (program.md · tasks/ · local Harbor runner). One synthetic-data round to prove the loop. No per-fund fine-tune yet.

v1 + corpus growth
  First fund crosses the 25-call threshold → AutoResearch loop triggers → first real per-fund fine-tune. AutoAgent rounds resume against the new, harder tasks.

v2 Matching
  Matching quality is now a direct function of both loops' output. Hermes Honcho queries run against the per-fund models.

v3 Founder Prep
  AutoAgent target surface expands — the avatar-interview harness joins the analyzer as a mutation target. AutoResearch gets a third eval dimension (pre-call screening accuracy).
Phase 07 · Ops

Next steps

What's on the calendar, and what needs to land before the Forbes rollout broadens.

Open commitments

Owner · Item

Jake · Meet Jon Forbes Friday — define compensation structure
Jordan · Create technical guides and development assets (this guide + wiki are the first two)
Jordan · Scaffold Hermes dual runtime — memory writes + delivery peer · start accumulating corpus before v2 matching
Team · Follow-up meeting scheduled for Wednesday or sooner
Focus · Finalize the Forbes partnership terms before broader rollout

Strategy artefacts

Follow-ups

  • Clean up the package.json name + repo URL (still inherited from ClawBox fork).
  • Rename default branch to something durable once Forbes terms are locked.
  • Promote Signal from operator-only to Slack delivery once the analyzer output stabilises.
  • Draft the v2 Matching spec after the v1 corpus has enough calls to validate the prompt.
Phase 08 · People

Stakeholders

Who's involved and what they own.

Jake

Business lead. Owns the Forbes relationship and the Friday compensation meeting. Runs founder intake through Airtable.

Jordan

Technical lead. Owns technical guides, development assets, and the Signal / Matching delivery surface.

Jon Forbes (Crowley Capital)

Primary partner. Proven software funding pipeline; wants expansion into ambiguous industries. MatchBox plugs into his existing deal flow.

Crowley Capital

Seed node for network depth. Anchor fund for the case study.

Organized AI

Parent org. Ships MatchBox alongside ClawBox, OpenClaw, and the broader market projects.

matchbox-guide
Organized-AI/MatchBox · private
Tauri v2 · Bun · React 18
matchbox-guide.pages.dev/#home