Organized AI · MatchBox

MatchBox — strike the match between founders and VCs

A matching layer that sits on top of a partner fund's existing deal flow. Listens to VC-founder calls, scores sentiment + fit, surfaces coaching notes for the founder and ratings for the VC, and turns high-confidence pairings into portable signal. Not another CRM — the engine that makes the next-best founder obvious.

v1 Signal · shipping
v2 Matching · planned
v3 Founder Prep · planned
Forked from ClawBox (Tauri + Bun + React 18)
Forbes-first rollout
Revenue model in negotiation

Product thesis

The best founder–VC matches are obvious in the first 20 minutes of a call. MatchBox makes that signal legible, portable, and actionable — and, critically, durable and confidential. Memory is the cornerstone of the matching problem; the non-obvious connections between funds and founders only surface once a deep corpus exists. Transcripts are PII — so every analyzer call runs against a per-fund fine-tuned local Gemma on claws-mac-mini. Nothing sensitive crosses the Pi-harness boundary. See Runtime Choice.

Four moves per call

listens

Transcribes every VC-founder call and scores sentiment, fit, and red flags. Captures training data for v2 matching.

matches

Pairs founders with the right fund based on call signal plus intake data, not just thesis keywords. (v2 deliverable.)

coaches

Returns founder-ready coaching notes and VC-ready call ratings after each conversation. Closes the feedback loop.

connects

High-confidence deals get shared across a trusted VC network as the subscriber base grows. Network depth compounds.

At-a-glance flow

end-to-end · dual runtime from v1

[Airtable intake] founders sign up ──sync──▶ fund store
VC call ──▶ transcript ──▶ SignalView ──▶ POST /api/signal/calls
        ┌────────────────────┼────────────────────┐
   [OpenClaw]            [Hermes]            [local store]
   analyzer              memory + delivery   FundProfile[]
   one-shot JSON         Honcho · skills     CallSignal
   CallSignal[]          ├──▶ Slack
                         ├──▶ WhatsApp
                         ├──▶ email
                         └──▶ nightly digest
Seed node
Crowley Capital is the seed node for the curated VC network. Partner contact: Jon Forbes.

At a glance

Key — Value
Repo — Organized-AI/MatchBox (private · MIT)
Default branch — claude/founder-fund-matching-urzts
Base project — Forked from ClawBox (still reports clawbox in package.json)
Shipping surface — Signal · src/components/SignalView.tsx
Founder intake — Airtable form
Strategy call — Granola notes
Phase 01 · Strategy

Strategy

Upgrade a working 7/10 VC workflow into the 10/10 industry benchmark by wiring MatchBox into a partner fund's existing process. No forced platform migration.

Forbes-first rollout

Primary · Jon Forbes — Proven software funding pipeline · wants ambiguous-industry expansion
Non-exclusive — Forbes is the case study; same system offered to other funds once proven

Why Forbes first

  • MatchBox slots in as the matching + signal layer on top of existing deal flow — no CRM migration.
  • Forbes's proven pipeline + infrastructure means integration is additive, not invasive.
  • Ambiguous-industry expansion is the explicit thesis he wants — matches MatchBox's core capability.
  • Landing a reference customer early makes the subscription tier easier to price on real outcomes.

Revenue model — in negotiation

Jake meets Jon Forbes this Friday to lock compensation terms before broader rollout. Options on the table:

  • Revenue share on existing portfolio companies uplifted by MatchBox.
  • Deal participation in successful post-implementation investments.
  • Subscription for other funds accessing the MatchBox network.
Open question
  • Terms are not locked yet. All downstream rollout plans depend on the Friday meeting landing.
Phase 02 · Roadmap

Product Phases

Three product versions, compounding value — Signal captures the training data for Matching; Matching's ranking signals feed Founder Prep.

v1 · Signal · shipping

Turns each VC-founder call into structured signal. Transcribe → sentiment + fit + red flags → coaching notes for the founder + ratings for the VC. Captures training data for v2.

v2 · Matching · next

High-confidence founder ↔ fund pairings. Cross-network deal sharing between subscribed funds. Match quality improves fund-over-fund as the data pool grows.

v3 · Founder Prep · later

AI avatar conducts the initial founder interview. Automated screening before the VC meeting. Due-diligence layer that cuts fraudulent claims and wasted partner time.

phases compound · each one produces the training input for the next

[v1 · SIGNAL]              [v2 · MATCHING]             [v3 · FOUNDER PREP]
transcribe                 founder ↔ fund pairings     AI avatar interview
sentiment · fit · flags    cross-network deal share    automated screening
coaching notes             match quality compounds     due-diligence layer

every CallSignal           ranked matches + reasons    founder-side facts
writes to Hermes memory    inform next prep cycle      (avatar-interview output)
        │                                                      │
        └──────────▶  HERMES HONCHO · fact graph  ◀────────────┘
        fund · founder · sector · stage · red-flag patterns · coaching tone
        lives on claws-mac-mini at ~/.hermes/memories/

v1 surface (today)

  • src/components/SignalView.tsx — the operator UI.
  • src/types/signal.ts — FundProfile · CallSignal · SentimentScores · RedFlagHit · AirtableConfigView · AnalyzeCallInput.
  • src/services/signal.ts — fetch client against the local backend at http://127.0.0.1:13000.
  • src/store/signal.ts — Zustand state: funds, calls, Airtable config, loading/syncing/analyzing flags.
  • internal/routes/signal.ts — Hono routes (Airtable config, sync, funds CRUD, analyzer).
  • internal/signal/{index,airtable,analyzer}.ts — store + Airtable sync + LLM analyzer.

What Signal is not

  • Not a recording tool — works off transcripts you already have.
  • Not a standalone SaaS — it's an internal operator surface plus agent delivery into the VC's existing tools.
  • Not yet the matching engine — v1 captures the training signal for v2.
Phase 03 · Signal

Signal — the v1 shipping surface

Signal turns a call into a structured CallSignal: rating, sentiment scores, red flags with quoted evidence, fact-finding prompts, and coaching notes for the VC.

Data model

interface FundProfile {
  id, name, thesis, stage, checkSize, sectors[], redFlags, notes
  source: 'manual' | 'airtable'
  airtableRecordId?, createdAtMs, updatedAtMs
}

interface CallSignal {
  id, fundId, fundName, founderName, callDate
  transcriptExcerpt, rating (0-10)
  sentiment: { enthusiasm, thesisAlignment, concern }   // each 0-10
  redFlags: [{ quote, reason }]
  factFindingPrompts: string[]
  coachingForVc, summary
  model?, createdAtMs
}

interface AirtableConfigView {
  apiKey (masked), hasApiKey, baseId, tableName, viewName
}

Backend routes — internal/routes/signal.ts

Method + path — Role
GET /api/signal/airtable — Current Airtable config (API key masked to last 4)
PUT /api/signal/airtable — Patch config · supports clearApiKey
POST /api/signal/airtable/sync — Pull FundProfile rows from Airtable
GET · POST · PUT · DELETE /api/signal/funds — Manual fund CRUD (coexists with Airtable-sourced rows)
GET · POST · DELETE /api/signal/calls — Call list + analyze + delete
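The analyze endpoint takes a small JSON body. A minimal client-side sketch, assuming the field names listed in the route table; the request builder is separated from `fetch()` so it can be exercised offline, and the `{ call: CallSignal }` response shape is taken from the per-call flow below:

```typescript
// Hypothetical client helper for POST /api/signal/calls.
// Field names mirror the documented route; nothing else is guaranteed.
interface AnalyzeCallInput {
  fundId: string
  founderName: string
  callDate: string   // ISO date
  transcript: string
}

export function buildAnalyzeRequest(input: AnalyzeCallInput) {
  return {
    url: "http://127.0.0.1:13000/api/signal/calls",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(input),
    },
  }
}

// Usage (network call elided):
// const req = buildAnalyzeRequest({ fundId, founderName, callDate, transcript })
// const res = await fetch(req.url, req.init)   // 200 → { call: CallSignal }
```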

Per-call flow

one POST → one CallSignal + two Hermes side-effects

SignalView ──▶ POST /api/signal/calls { fundId, founderName, callDate, transcript }
    internal/routes/signal.ts
        analyzeTranscript(fund, transcript)        internal/signal/analyzer.ts
        callOpenclawDefaultModelChatCompletion()   OpenClaw gateway
        parse JSON · clampScore · parseRedFlags (max 8) · parsePrompts (max 8)
    CallSignal { rating · sentiment · redFlags · factFindingPrompts ·
                 coachingForVc · summary · transcriptExcerpt ≤4000 }
    ├── fire-and-forget Hermes memory write (signal-memory.ts) — Honcho facts keyed by fund · founder · sector
    ├── persist in internal/signal/index — 200 { call: CallSignal }
    ├── fire-and-forget Hermes delivery (delivery/hermes.ts) — Slack · WhatsApp · email · digest
    └── zustand store ──▶ SignalView re-render

Analyzer — internal/signal/analyzer.ts

Builds the prompt from the FundProfile + transcript (truncated to MAX_EXCERPT_CHARS = 20_000), calls callOpenclawDefaultModelChatCompletion, validates the JSON response, and produces an AnalyzeResult that gets wrapped into a persisted CallSignal.

  • Score clamping — each numeric score rounded to 0.1 and bounded to [0, 10].
  • Red flags cap — max 8 per call, each carrying a literal quote + reason.
  • Fact-finding prompts — max 8; the next-call follow-up questions the VC should ask.
  • Transcript truncation is explicit — the prompt appends …[truncated N chars] so the model knows it saw a cut transcript.
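The validation rules above are simple enough to state precisely in code. A sketch of what the helpers could look like — the function names echo the source, but the signatures are illustrative:

```typescript
// Illustrative analyzer validators (signatures are assumptions, not the repo's).

// Round to 0.1 and bound to [0, 10], per the score-clamping rule.
export function clampScore(n: number): number {
  const bounded = Math.min(10, Math.max(0, n))
  return Math.round(bounded * 10) / 10
}

const MAX_RED_FLAGS = 8

// Keep at most 8 red flags; each must carry a literal quote + reason.
export function parseRedFlags(raw: unknown): { quote: string; reason: string }[] {
  if (!Array.isArray(raw)) return []
  return raw
    .filter((f): f is { quote: string; reason: string } =>
      typeof f?.quote === "string" && typeof f?.reason === "string")
    .slice(0, MAX_RED_FLAGS)
}

// Explicit truncation marker so the model knows the transcript was cut.
export function truncateTranscript(t: string, max = 20_000): string {
  if (t.length <= max) return t
  return t.slice(0, max) + `…[truncated ${t.length - max} chars]`
}
```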
Why Airtable
The founder intake form is already on Airtable — Jake runs the top of funnel there. Signal syncs fund profiles from Airtable rather than inventing a new schema. Base id + table name + view are all config.
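Since base id, table name, and view are all config, the sync boils down to one Airtable list-records request. A hedged sketch of the URL construction — the endpoint shape follows Airtable's public REST API, while the mapping from Airtable fields onto FundProfile rows is left as an assumption:

```typescript
// Sketch of the Airtable list-records request used by syncFundsFromAirtable.
// Config fields match AirtableConfigView; field mapping is illustrative.
interface AirtableConfig { apiKey: string; baseId: string; tableName: string; viewName?: string }

export function buildAirtableListUrl(cfg: AirtableConfig): string {
  const base = `https://api.airtable.com/v0/${cfg.baseId}/${encodeURIComponent(cfg.tableName)}`
  return cfg.viewName ? `${base}?view=${encodeURIComponent(cfg.viewName)}` : base
}

// Usage (network call elided):
// const res = await fetch(buildAirtableListUrl(cfg), {
//   headers: { Authorization: `Bearer ${cfg.apiKey}` },
// })
// const { records } = await res.json()
// records[].fields then map onto FundProfile rows keyed by airtableRecordId.
```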
Phase 04 · System

Architecture

MatchBox inherits the ClawBox three-process split: Tauri shell · React 18 SPA · Bun/Hono backend. Signal adds one route module, one store, and one analyzer over that base.

Processes

[Tauri v2 shell · Rust]   src-tauri/ · OS integration · autoupdate
[React 18 + Vite SPA]     src/ · Zustand · Tailwind · en / zh i18n
        │  fetch http://127.0.0.1:13000
[Bun / Hono backend]      internal/ · routes incl. signal.ts
        ├──▶ Airtable fund intake sync
        │  WebSocket RPC
[OpenClaw Gateway]        http://127.0.0.1:18789/v1
[LLM provider]            via callOpenclawDefaultModelChatCompletion
dual runtime · local Gemma for PII-bearing analysis · Hermes for memory + delivery · OpenClaw non-PII only

Bun / Hono backend (127.0.0.1:13000)
    ├──▶ [local Gemma · MLX]   mlx_lm.server :8080 · fund-<id>/ LoRA adapters
    │        PII-sensitive analyzer · structured JSON analyzer output
    │        CallSignal persisted locally
    ├──▶ [Hermes Gateway]      ai.hermes.gateway · claws-mac-mini · memory + delivery
    │        ├──▶ Honcho facts (~/.hermes/memories/) ◀── fire-and-forget write of local analyzer output
    │        ├──▶ Slack · WhatsApp · email · digest
    │        └──▶ metadata-only delivery payloads
    └──▶ [OpenClaw Gateway]    :18789/v1 · non-PII tasks only · title suggestions, settings probes

degradation mode — if Hermes is unreachable, memory + delivery skip silently; the analyzer path and the SignalView UX are unaffected.
PII invariant — no transcript crosses the Pi-harness boundary. Lint-gated in code.

Repo layout

.
├── src/
│   ├── components/
│   │   ├── SignalView.tsx                            # v1 operator UI
│   │   ├── ChatView · OnboardView · SettingsView · … # inherited from ClawBox
│   │   └── chat/ · layout/ · plugins/ · sidebar/ · skills/ · soul/ · ui/
│   ├── services/signal.ts                            # fetch client
│   ├── store/signal.ts                               # zustand
│   └── types/signal.ts                               # shared TS types
├── internal/
│   ├── routes/signal.ts                              # Hono routes
│   ├── signal/
│   │   ├── index.ts                                  # store (funds + calls + airtable cfg)
│   │   ├── airtable.ts                               # syncFundsFromAirtable
│   │   └── analyzer.ts                               # analyzeTranscript via OpenClaw default model
│   ├── providers/openclaw-completions.ts             # model bridge
│   ├── channels · onboard · skills · titles · tool-call-summaries
│   └── routes/{agents,channels,chat,config,cron,models,onboard,sessions,skills,soul,titles,tools}.ts
├── src-tauri/                                        # Tauri v2 Rust shell
├── scripts/                                          # mock-gateway · sync-version · sign-win · …
├── docs/                                             # dependency-policy · openclaw-compatibility · releases · releasing
├── public/ · test/ · index.html
└── package.json                                      # still named "clawbox" · v2026.3.17

Inherited from ClawBox

  • All 12 backend routes (agents · channels · chat · config · cron · models · onboard · sessions · skills · soul · titles · tools).
  • All non-Signal frontend views (Chat · Cron · Onboard · Plugins · Settings · Skills · Soul · Startup · etc.).
  • Onboarding wizard, Gateway-restart banner, language toggle, theme toggle, i18n en/zh.
  • Scripts: mock-gateway.mjs, smoke-backend, sync-version, Windows signing chain, macOS .dmg build.
  • See the ClawBox guide and ClawBox wiki for the base surface.

What's new versus ClawBox

  • SignalView.tsx, signal.ts type/service/store trio, routes/signal.ts, internal/signal/*.
  • MatchBox branding (README, banner.png, logo SVG), founder-intake Airtable hookup.
Phase 05 · Delivery model

Agent, not platform

MatchBox ships as an agent. Each VC gets a custom GPT trained on their own process, reachable where they already work — not as a forced migration to a new app.

Surfaces

Slack

Coaching notes + call ratings posted directly into the fund's working channel.

WhatsApp

Same signal, delivered where the partner already texts with founders.

Email

Follow-up + next-call prompts surfaced alongside the VC's existing thread.

Airtable form

Founder intake — the signup link. Rows sync into MatchBox's fund store.

Delivery fanout

CallSignal → Hermes gateway → where partner funds actually see it

CallSignal
    internal/delivery/hermes.ts
    Hermes gateway (claws-mac-mini) · ai.hermes.gateway · launchd-supervised
    ├──▶ Slack         Socket Mode · @claude_code_slack
    ├──▶ WhatsApp      Business API
    ├──▶ Telegram      bot gateway
    ├──▶ email         SMTP/Gmail MCP
    ├──▶ SMS           provider bridge
    └──▶ cron digest   nightly → Crowley Capital channel
                       Hermes cron scheduler · 0 9 * * * · per-run cost cap · pause/resume
Desktop shell is optional
The Tauri desktop app in this repo is the internal operator surface. Partner funds don't have to adopt it to get value — the agent can run headlessly and reach them through the channels above.

Network effects

  • A curated VC network grows as more funds subscribe.
  • Cross-fund deal sharing for premium members (v2).
  • Crowley Capital seeds network depth.
  • Closes the industry due-diligence gap: fewer fraudulent claims, better fit, less wasted time on poor matches.
Phase 06 · Running from source

Dev

Engine-identical to ClawBox. Same npm ci → npm run dev loop.

Install + run

npm ci
npm run dev         # frontend (Vite :14200) + backend (Bun :13000)
npm run tauri:dev   # desktop shell

OpenClaw for the analyzer

npm install -g openclaw@latest
openclaw gateway run --dev --auth none --bind loopback --port 18789

Build + verify

npm run build:frontend
npm run build:backend
cargo check --manifest-path src-tauri/Cargo.toml

Repo hygiene

npm run scan:repo
npm run audit:licenses
npm run audit:deps
npm run smoke:backend   # against mock-gateway.mjs

Branch note

Default branch today: claude/founder-fund-matching-urzts. No main at repo root yet — signal + matching work land on topic branches.

Phase 07 · Runtime choice · memory-first

OpenClaw + Hermes — dual runtime from v1

Memory is the cornerstone of MatchBox. Matching funds to founders requires connections that only surface after the corpus is deep enough — unforeseen pattern matches across sectors, stages, and partner appetites. That means Hermes can't wait until v3. It lands alongside OpenClaw in v1, as the memory substrate.

The only real coupling

MatchBox calls callOpenclawDefaultModelChatCompletion in internal/providers/openclaw-completions.ts. That's the sole tight link to OpenClaw. Everything else — the Tauri shell, React frontend, Bun backend, SignalView, the Airtable sync, the analyzer's validation layer — is provider-agnostic. A swap is a provider-module rewrite; an addition (what we're doing) is even cheaper.

The strategic bet
Memory-as-cornerstone. A matching engine without durable cross-call memory is just a sentiment scorer. Founder ↔ fund fit depends on patterns that only emerge over time — "fund X likes early fintech but balks at regulatory risk," "founder Y's second call went better than their first," "these three founders all flagged the same market signal." None of that is visible in a single call. All of it is visible if we write to Hermes memory from call one.

Dual-runtime plan (adopted)

OpenClaw · analyzer

Keeps doing what it's best at — one-shot structured-JSON analysis of a transcript. Fast, low-latency, mock-testable, Forbes-demo safe. No changes.

Hermes · memory + delivery

Writes every CallSignal into Honcho/Mem0 keyed by fund + founder + sector + stage. Queries return cross-call pattern matches. Same runtime also handles Slack/WhatsApp/email delivery.

Hermes pros

  • Platform delivery is native. The "agent, not platform" drop into Slack / WhatsApp / email / Telegram is Hermes's core design. MatchBox's stated delivery model maps 1:1.
  • Already running on claws-mac-mini. Reuse the Pi harness — inherit the launchd supervision, Codex OAuth primary, Gemma-4 self-heal fallback, repo-digest tool surface.
  • Entitlement-funded inference. Codex OAuth means marginal cost can trend to zero when a ChatGPT subscription covers it. Gemma fallback keeps the analyzer working when Codex blips.
  • Memory system (Honcho / Mem0). v2 Matching is a learning problem — durable per-partner-fund context across calls comes for free rather than being a custom build.
  • MCP-native. Airtable MCP, Gmail MCP, CRM MCPs plug in without code changes. Lines up with v3 Founder Prep nicely.
  • Skills hub + learning loop. v3's AI-avatar interview maps onto Hermes's skill-extraction pattern cleanly.
  • Cron scheduler built in — scheduled weekly digest emails to partner funds, no custom scheduler needed.

Hermes cons

  • Rewrites ClawBox inheritance. Onboarding wizard, SettingsView, PluginsView, GatewayRestartBanner, internal/compatibility.json all assume OpenClaw. Touches most of them.
  • API shape mismatch. Hermes is an agent runtime — you talk to session loops, not a raw chat-completion endpoint. The analyzer wants one-shot structured JSON, which is exactly OpenClaw's sweet spot. Bridging requires either a thin wrapper around Hermes's model-normalize layer or framing each call as a fresh session.
  • Loses dev ergonomics. scripts/mock-gateway.mjs + smoke:backend only exist because OpenClaw has a defined minimum RPC surface. Hermes has no equivalent mock — smoke tests get more expensive.
  • Support boundary blurs. OpenClaw cleanly splits "client bug vs gateway bug." Swapping means MatchBox inherits Pi-harness operational concerns (self-heal, Codex-OAuth expiry, Gemma fallback).
  • Forbes-rollout risk. Swap delays the first partner-facing demo. Not a technical risk; a delivery risk — Jake's Friday meeting is the gate.
  • Harder self-host for partners. OpenClaw: npm i -g openclaw@latest. Hermes: Python venv + service supervisor. If partner funds ever self-host, OpenClaw is a gentler ask.

OpenClaw pros (reasons to stay)

  • Already working. MatchBox is a ClawBox fork; every wire is already run.
  • Purpose-built for this pattern. Desktop client + portable-vs-system local runtime is OpenClaw's explicit design. Hermes doesn't have that shape.
  • Consistent with the portfolio. ClawBox, OpenClaw, MatchBox, and future OpenClaw-family desktop products share a single backend contract.
  • Defined minimum RPC surface. models.list, sessions.*, chat.*, config.* — testable against a mock, easy to reason about.

OpenClaw cons

  • Delivery channels not as mature. Supported, but not deployed through them at Hermes's scale.
  • No persistent memory out of the box. v2 Matching + v3 Founder Prep both lean on "remember across calls" — more custom work under OpenClaw.
  • No entitlement-funded inference path. OpenClaw bills through whatever provider you configure. Hermes on the Pi harness has the Codex OAuth escape hatch.

Phased recommendation (memory-first)

v1 Signal · dual runtime from day one — OpenClaw analyzer unchanged. Add Hermes memory writes + delivery peer. Memory corpus starts accumulating from call one.
v2 Matching — Matching engine queries the Hermes memory graph. By now the corpus is deep — surfaces unforeseen fund/founder pairings across sectors and stages.
v3 Founder Prep — AI avatar interview uses the same memory graph. Founder side of the map populates, not just the VC side.
Revised heuristic
Start writing to memory before you need it. You can't retroactively populate Honcho with calls that already happened. Every day MatchBox runs without Hermes memory is a day of corpus loss. OpenClaw stays because it works; Hermes joins because memory is the whole product thesis.

PII policy — per-fund local fine-tunes

Transcripts contain named founders, named funds, named partners, financial figures, and confidential thesis signal. None of it leaves the Pi harness. The analyzer runs against a per-fund fine-tuned local Gemma served by MLX (Apple's Metal-accelerated ML framework); the base Gemma-4 4-bit quant serves as the fallback for funds without a fine-tune yet, via mlx_lm.server on :8080.

Why MLX on the Mac mini
claws-mac-mini is Apple silicon (M-series). MLX is native Metal — better memory efficiency than llama.cpp for fine-tuning, first-class LoRA tooling via mlx_lm.lora, single-framework stack for both inference and training. The mlx_lm.server exposes an OpenAI-compatible /v1/chat/completions endpoint so client code (local-gemma.ts) doesn't care that the runtime changed under it.
The hard rule
No transcript ever crosses the Pi-harness boundary. Codex OAuth, the ChatGPT backend, any external inference path — all barred from seeing call content. Metadata (fund id, call id, rating) can travel over the wire for delivery purposes. Transcript bytes cannot.
PII-safe analyzer routing · everything inside the boundary stays local

claws-mac-mini · Tailnet-only
    analyzer(fund, transcript)
        lookup fund model
        ├──▶ fine-tune exists?   mlx_lm.server · ~/.hermes/models/fund-N/ + LoRA
        │                        fine-tuned on that fund's prior CallSignal corpus
        │                        └──▶ structured JSON
        └──▶ no fine-tune yet?   mlx_lm.server · mlx-community/gemma-4-it-4bit (base, :8080)
                                 └──▶ structured JSON
    CallSignal persists locally ──▶ Hermes Honcho (fact writes)

metadata only crosses the boundary ──▶ Hermes delivery · Slack · email · digest
    fund id · call id · rating · summary headline
    NEVER the raw transcript or red-flag quotes

Fine-tune pipeline (MLX)

  1. Corpus. Each fund's CallSignal history (analyzer inputs + outputs) is the training set. Stored locally under ~/.hermes/corpora/fund-<id>/ in JSONL. Never exported.
  2. Base model. mlx-community/gemma-4-it-4bit — Gemma 4 instruction-tuned, 4-bit MLX quantisation. Converted once via mlx_lm.convert and cached at ~/.hermes/models/gemma-base/.
  3. Training. LoRA adapters via mlx_lm.lora --model gemma-base --train --data <fund-jsonl> --adapter-path <fund-dir>. Runs on-host; the M-series GPU handles it in minutes-to-hours depending on corpus size.
  4. Trigger. Automatic when a fund's corpus crosses a call-count threshold (e.g. 25 analysed calls), or manual via the CLI matchbox finetune <fundId>.
  5. Output. Per-fund directory at ~/.hermes/models/fund-<id>/ containing the LoRA adapter + config. mlx_lm.server loads by path; optionally mlx_lm.fuse bakes the adapter into a standalone fund model for faster cold loads.
  6. Rotation. Retrain periodically (monthly cadence to start) as the corpus grows. Prior directory preserved as fund-<id>.v<N>/ for rollback.
  7. Tenancy. One adapter per fund. A fund's adapter never sees another fund's calls.

What the analyzer provider module looks like

// internal/providers/local-gemma.ts (replaces the OpenClaw path for PII-bearing calls)
// backed by mlx_lm.server — OpenAI-compatible, so the client shape is unchanged
export async function analyzeWithFundModel(fundId, messages) {
  const modelAlias = `fund-${fundId}`   // e.g. fund-f123
  const hasFineTune = await adapterExists(modelAlias)
  const model = hasFineTune ? modelAlias : 'gemma-base'
  return fetch('http://127.0.0.1:8080/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, messages, temperature: 0.2 }),
  })
}

// one-time host setup · done during Phase 0 scaffolding
// brew install python@3.11
// pip install mlx-lm
// mlx_lm.convert --hf-path google/gemma-4-it -q --q-bits 4 --mlx-path ~/.hermes/models/gemma-base
// mlx_lm.server --model ~/.hermes/models/gemma-base --port 8080 --adapter-path ~/.hermes/models

Policy invariants (enforced in code)

  • analyzeTranscript may only call providers whose base URL is 127.0.0.1 or a Tailnet-scoped host. Lint-gated.
  • internal/providers/openclaw-completions.ts is reserved for non-PII tasks (title suggestions, settings probes). An explicit allowExternal: false flag gates every call path.
  • Delivery payloads pass through a sanitiser (internal/delivery/sanitiser.ts) that strips transcriptExcerpt and the redFlags[].quote literal text. Only reason and structured scores travel.
  • Memory writes stay on the Pi harness — Honcho is local; no Mem0 cloud ingestion for fund/founder facts.
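The sanitiser invariant above can be captured in a few lines. A minimal sketch, assuming a pared-down CallSignal shape — the real `internal/delivery/sanitiser.ts` will carry more fields, but the rule is the same: scores and reasons travel, transcript text and quote literals never do:

```typescript
// Illustrative sanitiser: delivery payloads keep structured scores and
// red-flag reasons; transcriptExcerpt and quote literals are stripped.
interface RedFlag { quote: string; reason: string }
interface CallSignalLike {
  id: string; fundId: string; rating: number; summary: string
  transcriptExcerpt?: string
  redFlags: RedFlag[]
}

export function sanitiseForDelivery(signal: CallSignalLike) {
  const { transcriptExcerpt: _dropped, ...rest } = signal
  return {
    ...rest,
    // reason travels; the literal quote never leaves the host
    redFlags: signal.redFlags.map(({ reason }) => ({ reason })),
  }
}
```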

Memory write + query

how a CallSignal becomes a matchable fact graph · write path + read path

WRITE PATH — every analysed call
    CallSignal (id=c789, fundId=f123, founderName="Alice Kim", rating=7.8)
    signal-memory.write(signal)          internal/memory/signal-memory.ts
    structured facts ──▶ Hermes Honcho
        fund:f123/thesisAlignment/sector:fintech/stage:seed = 8.2
        fund:f123/coachingTone = "pragmatic-over-visionary"
        founder:"Alice Kim"/redFlagPattern/regulatory = "we'll ignore SOC2 until Series A"
        founder:"Alice Kim"/enthusiasm = 9.1
        call:c789/summary · /rating=7.8 · /model=gpt-4 · /callDate=2026-04-22
    persisted in ~/.hermes/memories/ on claws-mac-mini

READ PATH — later, when v2 matching (or a v1 suggestion panel) asks
    GET /api/signal/memory/suggestions?founderName=Alice Kim
    Hermes Honcho query — "which funds:
        • have thesisAlignment[sector:fintech][stage:seed] > 7
        • AND coachingTone compatible with founder's enthusiasm style
        • AND do NOT flag regulatory risk as blocker
        • AND haven't seen Alice in the last 30 days"
    ranked match candidates [{ fund, score, reason-chain }, …]
    SignalView panel — "What else does this fund care about?" · visible in v1 before v2 formally ships

Integration plan — concrete file map

Phase 0 · scaffolding (days)

  • internal/providers/hermes-client.ts — thin client to the Hermes gateway on claws-mac-mini. Mirrors the shape of openclaw-completions.ts but targets ai.hermes.gateway's HTTP surface. Bearer-auth via a token stored alongside the existing gateway secrets.
  • internal/memory/signal-memory.ts — the memory write layer. Every completed CallSignal emits a set of structured facts to Hermes Honcho, keyed by fund id, founder name, sector, stage, and call date.
  • internal/delivery/hermes.ts — routes a CallSignal into Slack / WhatsApp / email via Hermes platform adapters. Reuses the same client.
  • Settings: new fields for Hermes gateway URL + per-fund delivery channel id. Backwards-compat — if unset, delivery is skipped and memory writes are a no-op (preserves the current OpenClaw-only flow for tests).
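The backwards-compat gate in the last bullet is worth pinning down: unset config must mean "do nothing", not "fail". A sketch under assumed names (`planMemoryWrite`, `HermesConfig` are illustrative, not the repo's):

```typescript
// Hypothetical config gate for hermes-client.ts: when no gateway URL is
// configured, memory writes and delivery become explicit no-ops, which
// preserves the OpenClaw-only flow for tests.
interface HermesConfig { gatewayUrl?: string; token?: string }

type WriteResult = { status: "skipped" } | { status: "sent"; url: string }

export function planMemoryWrite(cfg: HermesConfig, facts: Record<string, unknown>): WriteResult {
  if (!cfg.gatewayUrl) return { status: "skipped" }   // unset → no-op
  // A real client would fire-and-forget here:
  //   fetch(url, { method: "POST",
  //     headers: { Authorization: `Bearer ${cfg.token}` },
  //     body: JSON.stringify(facts) }).catch(logOnly)
  return { status: "sent", url: `${cfg.gatewayUrl}/memories` }
}
```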

Phase 1 · memory writes during v1 (weeks)

  • Hook into POST /api/signal/calls. After the analyzer returns a CallSignal, fire-and-forget a memory write with facts like:
    • fund:<id>/thesisAlignment/sector:<s>/stage:<t> = N
    • founder:<name>/redFlagPattern/<pattern> = quote
    • fund:<id>/coachingTone/<style>
    • call:<id>/summary · /rating · /model · /callDate
  • Write errors must not block the user — Hermes is additive, not on the critical path for v1 demo.
  • Log memory-write failures to internal/logger; expose a health indicator in SettingsView so operators know when the memory substrate is stale.
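The fact shapes in the bullets above can be generated mechanically from a CallSignal. A sketch of the key construction — the input fields and key layout follow the examples listed, but the helper itself is hypothetical:

```typescript
// Illustrative fact builder for signal-memory.ts: one CallSignal in,
// a flat map of Honcho-style keys out. Key shapes follow the bullets above.
interface SignalFacts {
  fundId: string; founderName: string; callId: string
  sector: string; stage: string
  thesisAlignment: number; rating: number
}

export function buildFacts(s: SignalFacts): Record<string, string | number> {
  return {
    [`fund:${s.fundId}/thesisAlignment/sector:${s.sector}/stage:${s.stage}`]: s.thesisAlignment,
    [`call:${s.callId}/rating`]: s.rating,
    [`founder:${s.founderName}/lastCall`]: s.callId,
  }
}
```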

Phase 2 · reads + UI surfacing (follow-up)

  • GET /api/signal/memory/suggestions?fundId=&founderName= — queries Hermes Honcho for cross-call patterns and returns ranked match candidates.
  • New panel in SignalView: "What else does this fund care about?" powered by memory queries — visible insight even before v2 matching ships.
  • Nightly cron on the Pi harness runs a memory-graph digest that gets posted into the Crowley Capital Slack channel.

Phase 3 · promotion to primary runtime (v3)

  • v3 Founder Prep lands. AI avatar interview writes founder-side facts into the same graph.
  • At this point a full OpenClaw → Hermes analyzer swap is cheap (the memory graph is already populated, the client module is already the same shape). Decide then whether to keep the hybrid or consolidate.

What this preserves

  • Forbes-demo safety. OpenClaw analyzer path is unchanged. If Hermes is unavailable, MatchBox degrades gracefully to the current behaviour.
  • Mock-gateway dev loop. Still works. Hermes writes are no-ops when the URL isn't configured.
  • ClawBox inheritance. Onboarding wizard, SettingsView, PluginsView all stay wired to OpenClaw. Hermes config lives in an additive settings section.
  • Support boundary clarity. OpenClaw bugs = client path. Hermes bugs = memory path. Keep them split during the dual-runtime phase.

Risks to watch

  • Memory schema drift. What you write in v1 is what v2 matching has to read. Lock a minimum schema before the first production call. Version it.
  • PII + confidentiality. Hermes memory now holds partner-fund + founder data. The Pi harness is on a Tailnet, but the memory layer is a new data surface — audit the storage + access paths.
  • Hermes gateway availability. Memory writes are fire-and-forget, but a silent back-pressure problem could cost you training data. Monitor the Hermes errors log as part of MatchBox ops.
  • Cost of cross-runtime calls. Analyzer stays on OpenClaw; memory on Hermes. Two gateway auth paths, two config sections, two failure modes. Document them explicitly in SettingsView.
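The schema-drift risk above has a cheap mitigation: stamp every fact with a schema version from the first production call. A sketch with illustrative field names — not the committed schema, just the versioning pattern:

```typescript
// Hypothetical versioned-fact wrapper: v2 readers can detect and migrate
// (or refuse) facts written under an unknown schema instead of misreading them.
const SCHEMA_VERSION = 1

interface VersionedFact {
  schemaVersion: number
  key: string
  value: string | number
  writtenAtMs: number
}

export function stampFact(key: string, value: string | number, now = Date.now()): VersionedFact {
  return { schemaVersion: SCHEMA_VERSION, key, value, writtenAtMs: now }
}

export function canRead(fact: VersionedFact): boolean {
  return fact.schemaVersion <= SCHEMA_VERSION
}
```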
Phase 08 · Hill-climb loops

AutoAgent × AutoResearch — two loops, both local

MatchBox uses two hill-climb loops that never leave claws-mac-mini. AutoAgent evolves the MatchBox harness itself — prompt, validator, tool chain — against a held-out benchmark of real calls. AutoResearch evolves each fund's local fine-tune as its corpus grows. Both honour the PII invariant: no transcript crosses the Pi-harness boundary.

The split in one line
AutoAgent mutates what runs (internal/signal/analyzer.ts + prompt + validators). AutoResearch mutates what gets run against (the per-fund Gemma fine-tune). Same closed loop (ρ · σ · ι · ε · κ), different target, different score function.

Composition — how the two loops compound

the two loops feed each other · everything stays on claws-mac-mini

AUTOAGENT — harness engineering loop
    mutates analyzer.ts · prompt · validators
    ──▶ better CallSignal quality
    ──▶ richer Honcho fact graph (fund · founder · sector · stage)
    ──▶ deeper per-fund corpus at ~/.hermes/corpora/fund-<id>/
AUTORESEARCH — per-fund fine-tune loop
    mutates LoRA config · prompt template · hyperparameters
    ──▶ retrained fund-<id>/ (MLX LoRA adapter)
    ──▶ even better CallSignals for that fund
    └──▶ back to AutoAgent (next round)

Loop A · AutoAgent for the MatchBox harness

The MatchBox analyzer itself is an agent harness. It has a prompt, a tool surface, a validator chain. AutoAgent treats it as the mutation target and hill-climbs against a benchmark of held-out calls.

AutoAgent loop · mutate the harness, score against held-out calls

program.md (human-authored directive)
    "MatchBox analyzer must score F1 ≥ 0.85 on held-out call set.
     No external inference. Red-flag quotes must be literal transcript excerpts."
Meta-agent (Claude Code · Codex · local — configured with allowExternal=false)
    ├──▶ reads internal/signal/analyzer.ts
    ├──▶ reads tasks/ (held-out transcripts + expected CallSignals)
    └──▶ reads results.tsv (prior rounds)
ρ · Reflect  — diagnose failure clusters in last round's trajectories
σ · Select   — propose prompt / validator / tool edits
ι · Improve  — apply edits to analyzer.ts (candidate)
ε · Evaluate — run candidate against tasks/ via LOCAL Gemma
    score = mean F1 over (sentiment · redflag-precision · coaching-quality · fit)
κ · Commit   — keep if score improved, else revert
    append { round, score, patch, trajectory-diff } to results.tsv
└──▶ loop until convergence or budget exhausted

AutoAgent setup in the MatchBox repo

  • program.md at repo root — the only file a human edits regularly. Directive + constraints (no external inference, preserve JSON shape, keep transcript-excerpt truncation).
  • tasks/ — held-out Harbor-format tasks. Each task: a transcript + an expected CallSignal shape (human-annotated rating, sentiment, red-flag categories).
  • results.tsv — append-only round log (score · patch · kept/discarded). Gitignored.
  • Harbor runner points at the local Gemma via --agent-import-path matchbox-analyzer:analyze — never reaches out to an external LLM.
  • Meta-agent = Claude Code or Codex running locally, configured with allowExternal=false so it can't leak transcript fragments into an external provider's context.
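The append-only round log above can be sketched as one tab-separated row per round. Column names follow the guide's description (round · score · patch · kept/discarded); the actual schema in the repo may differ.

```typescript
// Hypothetical results.tsv writer: one TSV row appended per AutoAgent round.
import { appendFileSync } from "node:fs";

interface RoundLog {
  round: number;
  score: number;
  patchSha: string;
  kept: boolean;
}

function formatRow(log: RoundLog): string {
  return [
    log.round,
    log.score.toFixed(4),
    log.patchSha,
    log.kept ? "kept" : "discarded",
  ].join("\t") + "\n";
}

function logRound(path: string, log: RoundLog): void {
  appendFileSync(path, formatRow(log)); // append-only, matching the gitignored results.tsv
}
```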

Loop B · AutoResearch for per-fund fine-tunes (MLX)

Each fund's LoRA adapter at ~/.hermes/models/fund-<id>/ is a living artefact. As the fund's corpus grows, the adapter gets retrained via mlx_lm.lora. AutoResearch mutates the fine-tune config (LoRA rank, target-modules, epochs, LR schedule, prompt template) via JSON Patch against a fund-scoped config, and hill-climbs against the fund's held-out eval suite.

AutoResearch loop · mutate fund-<id>/ adapter via mlx_lm.lora, score against the fund's own held-out corpus

signal source: every new analysed call ──▶ ~/.hermes/corpora/fund-<id>/
significance check: did this fund's accuracy drift below threshold? yes ──▶

ρ · Reflect  — inspect recent prediction errors vs ground-truth CallSignals
σ · Select   — propose fine-tune mutations (RFC 6902 JSON Patch)
               ops: replace /lora_rank · add /prompt-template/section · replace /epochs · …
ι · Improve  — apply patch to fund-<id>/finetune.config.json
               mlx_lm.lora --model gemma-base --train --data <fund-jsonl> --adapter-path <candidate>
               candidate adapter at fund-<id>-candidate/
ε · Evaluate — run candidate against the fund's held-out calls
               score = fund-scoped F1 + coaching-tone-match + red-flag precision
κ · Commit   — promote candidate to fund-<id>/ if better
               archive old as fund-<id>.v<N>/ for rollback
               optional: mlx_lm.fuse to bake adapter into a standalone fused model

version store: local git at ~/.hermes/versions/fund-<id>/ + SQLite manifest at ~/.hermes/versions/manifest.db
  └──▶ append patch + score + model hash; never leaves the host
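The ι step's config mutation is plain RFC 6902 application. A minimal sketch, covering only the "replace" and "add" ops the loop emits; a real implementation would use a full JSON Patch library, and the config field names here are illustrative.

```typescript
// Minimal JSON Patch applier for flat and nested object paths.
type PatchOp =
  | { op: "replace"; path: string; value: unknown }
  | { op: "add"; path: string; value: unknown };

function applyPatch(config: Record<string, any>, patch: PatchOp[]): Record<string, any> {
  const next = structuredClone(config); // never mutate the committed config
  for (const { path, value } of patch) {
    const keys = path.split("/").slice(1); // "/lora_rank" -> ["lora_rank"]
    let node = next;
    for (const k of keys.slice(0, -1)) node = node[k];
    node[keys[keys.length - 1]] = value;
  }
  return next;
}

// e.g. a σ-step proposal bumping LoRA rank and epochs before retraining
const candidate = applyPatch(
  { lora_rank: 8, epochs: 2 },
  [
    { op: "replace", path: "/lora_rank", value: 16 },
    { op: "replace", path: "/epochs", value: 3 },
  ],
);
```

The resulting candidate config is what gets fed to mlx_lm.lora for the retrain.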

Why local git + SQLite instead of Cloudflare KV

The sibling GTM-autoresearch deployment uses Cloudflare KV as the version store. For MatchBox, the version store would contain fund-scoped behaviour signal derived from transcripts — still PII-adjacent. So the drift history lives on-host:

  • Local git repo at ~/.hermes/versions/fund-<id>/ — one repo per fund, one commit per round. Each commit contains the applied JSON Patch plus the candidate config. Replayable.
  • SQLite manifest at ~/.hermes/versions/manifest.db — round log (round · fund · score · patch-sha · model-sha · kept/discarded). Queryable without checking out 47 fund repos.
  • No Cloudflare KV for fund data. KV remains available for non-PII surfaces (the hub index, the public guide deploys) but fund-scoped training signal does not touch it.
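The manifest's round log described above could look like the following schema. This is an assumed shape, shown as plain SQL strings to stay runtime-agnostic; the real manifest.db columns may differ.

```typescript
// Hypothetical manifest.db schema: one row per (round, fund) attempt.
const MANIFEST_DDL = `
CREATE TABLE IF NOT EXISTS rounds (
  round     INTEGER NOT NULL,
  fund      TEXT    NOT NULL,
  score     REAL    NOT NULL,
  patch_sha TEXT    NOT NULL,
  model_sha TEXT    NOT NULL,
  kept      INTEGER NOT NULL,  -- 1 = promoted, 0 = discarded
  PRIMARY KEY (round, fund)
);`;

// Queryable without checking out the per-fund git repos, e.g. best kept score per fund:
const BEST_PER_FUND = `
SELECT fund, MAX(score) AS best_score
FROM rounds
WHERE kept = 1
GROUP BY fund;`;
```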

Meta-agent sandboxing

  • Claude Code or Codex runs locally against the MatchBox repo. OpenClaw's allowExternal gate is set to false on every meta-agent session.
  • The tasks/ directory is gitignored. Held-out transcripts never reach the remote. If a contributor wants to share a benchmark task, they share a synthetic transcript that carries no real PII.
  • Trajectory logs are sanitised before they land in results.tsv. The sanitiser strips any literal transcript excerpt the meta-agent pasted into its reflection notes.
  • Fine-tune training runs on-host via mlx_lm.lora on the Mac mini's Metal GPU; no cloud training pipeline. Candidate evaluation and scoring also run via mlx_lm.server — the framework handles both inference and training in one Python environment.
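The sanitiser's rules are not spelled out in this guide, but its shape might look like the sketch below. The assumption that excerpts appear as long double-quoted spans is purely illustrative.

```typescript
// Hypothetical trajectory sanitiser: redact long quoted spans (assumed to be
// pasted transcript excerpts) before a reflection note lands in results.tsv.
// Short quoted tokens like labels are left alone.
function sanitizeReflection(note: string, minQuotedWords = 5): string {
  return note.replace(/"([^"]+)"/g, (match, inner: string) =>
    inner.trim().split(/\s+/).length >= minQuotedWords ? '"[excerpt redacted]"' : match,
  );
}
```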

Score functions

AutoAgent (harness)
  Score — Mean F1 over sentiment dimensions + red-flag precision + coaching-quality + fit-accuracy · evaluated on the shared held-out task set
  Threshold — Ship when F1 ≥ 0.85 on the held-out set, with no regression > 0.03 on any dimension

AutoResearch (fund fine-tune)
  Score — Fund-scoped F1 · coaching-tone match · red-flag precision against the fund's own held-out corpus
  Threshold — Promote candidate when ≥ current by ≥ 0.02, with fund-specific partner sign-off
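The two commit gates reduce to simple predicates. Thresholds come from the guide; the function names are assumptions, and partner sign-off stays a human step outside the code.

```typescript
type DimScores = Record<string, number>;

// AutoAgent gate: ship when mean F1 across dimensions is at least 0.85 and
// no single dimension regressed by more than 0.03 versus the previous round.
function shipHarness(current: DimScores, previous: DimScores): boolean {
  const dims = Object.keys(current);
  const mean = dims.reduce((s, d) => s + current[d], 0) / dims.length;
  const noRegression = dims.every((d) => previous[d] - current[d] <= 0.03);
  return mean >= 0.85 && noRegression;
}

// AutoResearch gate: promote when the candidate beats the current adapter
// by at least 0.02 on the fund-scoped score.
function promoteAdapter(candidateScore: number, currentScore: number): boolean {
  return candidateScore - currentScore >= 0.02;
}
```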

Invariants across both loops

  • No transcript ever reaches an external LLM — score computation runs against the local Gemma only.
  • Per-fund tenancy — AutoResearch writes per fund; no fund's config sees another fund's data.
  • Reverse-patchable mutations — every AutoResearch patch has a canonical reverse. AutoAgent relies on git revert for rollback.
  • Score-gated commit — κ only promotes when the objective improves and safety invariants hold.
  • Evidence lives on-host — results.tsv (AutoAgent), manifest.db + per-fund git repos (AutoResearch). Both exportable as bundles if partners audit.
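The reverse-patchability invariant can be made concrete: given the forward patch and the pre-mutation config, each op's canonical reverse is mechanical. A sketch covering only the ops the loop emits; names are illustrative and the flat-path assumption is mine.

```typescript
type Op =
  | { op: "replace"; path: string; value: unknown }
  | { op: "add"; path: string; value: unknown }
  | { op: "remove"; path: string };

// A "replace" reverses to a replace carrying the prior value; an "add"
// reverses to a "remove". Reversed ops are applied in the opposite order.
function reversePatch(forward: Op[], before: Record<string, unknown>): Op[] {
  return forward
    .map((op): Op => {
      if (op.op === "replace") {
        const key = op.path.slice(1); // assumes flat paths like "/lora_rank"
        return { op: "replace", path: op.path, value: before[key] };
      }
      if (op.op === "add") return { op: "remove", path: op.path };
      throw new Error(`no canonical reverse recorded for ${op.op}`);
    })
    .reverse();
}
```

Storing this reverse alongside each commit is what makes every AutoResearch round replayable and rollbackable.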
Why both loops need to exist

AutoAgent alone improves the harness but leaves each fund's base Gemma undifferentiated — matching quality caps out at what the shared prompt can extract. AutoResearch alone improves per-fund models but can't fix a structurally bad analyzer. Together they give you a harness that extracts progressively sharper signal and a per-fund model that learns the fund's specific voice, so each round of each loop makes the other loop's next round easier.

What lands when

v1 Signal
  AutoAgent scaffolding in the repo (program.md · tasks/ · local Harbor runner). One synthetic-data round to prove the loop. No per-fund fine-tune yet.

v1 + corpus growth
  First fund crosses the 25-call threshold → AutoResearch loop triggers → first real per-fund fine-tune. AutoAgent rounds resume against the new, harder tasks.

v2 Matching
  Matching quality is now a direct function of both loops' output. Hermes Honcho queries run against the per-fund models.

v3 Founder Prep
  AutoAgent target surface expands — the avatar-interview harness joins the analyzer as a mutation target. AutoResearch gets a third eval dimension (pre-call screening accuracy).
Phase 07 · Ops

Next steps

What's on the calendar, and what needs to land before the Forbes rollout broadens.

Open commitments

Owner · Item

Jake · Meet Jon Forbes Friday — define compensation structure
Jordan · Create technical guides and development assets (this guide + wiki are the first two)
Jordan · Scaffold Hermes dual runtime — memory writes + delivery peer · start accumulating corpus before v2 matching
Team · Follow-up meeting scheduled for Wednesday or sooner
Focus · Finalize the Forbes partnership terms before broader rollout

Strategy artefacts

Follow-ups

  • Clean up the package.json name + repo URL (still inherited from ClawBox fork).
  • Rename default branch to something durable once Forbes terms are locked.
  • Promote Signal from operator-only to Slack delivery once the analyzer output stabilises.
  • Draft the v2 Matching spec after the v1 corpus has enough calls to validate the prompt.
Phase 08 · People

Stakeholders

Who's involved and what they own.

Jake

Business lead. Owns the Forbes relationship and the Friday compensation meeting. Runs founder intake through Airtable.

Jordan

Technical lead. Owns technical guides, development assets, and the Signal / Matching delivery surface.

Jon Forbes (Crowley Capital)

Primary partner. Proven software funding pipeline; wants expansion into ambiguous industries. MatchBox plugs into his existing deal flow.

Crowley Capital

Seed node for network depth. Anchor fund for the case study.

Organized AI

Parent org. Ships MatchBox alongside ClawBox, OpenClaw, and the broader market projects.

matchbox-guide
Organized-AI/MatchBox · private
Tauri v2 · Bun · React 18
matchbox-guide.pages.dev/#home