Most people have never actually compared how an AI tool assembles the context it uses to answer them. You upload some files, you write a system prompt, you start chatting, and you assume the rest is magic.
It isn't magic. It's a specific architecture. And every major AI product — Claude Projects, ChatGPT Custom GPTs, Copilot, Notion AI — uses essentially the same one. Cruma uses a different one. That's the entire reason Cruma feels like it knows your business when other AI doesn't.
This page is for people who want to see mechanically what's actually different. No analogies, no AI-explainer hand-waving. Just the architecture.
Same model on top. Very different substrate underneath.
The standard pattern: documents → chat.
Every Anthropic product — Claude.ai, Claude Projects, Claude for Work, Claude Code — follows the same fundamental shape. So does ChatGPT. So does Copilot. So does Notion AI. Each varies at the edges, but the substrate is identical.
Claude Projects / Claude for Work
- Storage: a flat collection of files you upload (PDFs, docs, code) plus a free-text system prompt. Blob-shaped.
- Update model: manual. When your business changes, you re-upload. Files go stale between uploads.
- Retrieval: files are surfaced into the context window per conversation, with some chunking and matching. No typed retrieval — it's "find the doc that mentions this."
- Granularity: whole documents. There's no "the current ICP" or "the won/lost pattern for deals over $50K" as a queryable atom — it's a paragraph in a file you wrote.
- Persistence: project files persist. Conversation state resets each chat. Last chat doesn't inform next chat.
- External sync: none. If HubSpot changed yesterday, Claude doesn't know unless you re-export.
- Multi-user: shared files, but no audited "this is what our company believes" layer.
- Self-improvement: none. Your edits and rejections teach the project nothing for next time.
Claude.ai consumer + Memory
- Conversation-scoped context window. The model sees your message + recent history.
- Memory feature saves notes about you across conversations (manual or auto). Lives in user profile.
- Optional web search at query time when enabled.
- No business state. No integrations. No skills. No audit.
Claude Code
- File-system-based context.
CLAUDE.mdfiles at project root + nested folders. - Pull-based retrieval: the agent reads files on demand via Read / Grep / Glob tools.
- Git status, diff, and log as ambient context.
- Session-scoped — each session is fresh. Persistence comes from files + CLAUDE.md you commit to your repo.
MCP (Model Context Protocol)
- External tool and resource exposure to Claude (or any compatible client).
- Context becomes pull-based at inference time — "ask the MCP server when you need it."
- MCP itself doesn't organize the context. It's a transport protocol. The organizing happens inside whatever server you connect — which is exactly the slot Cruma fills.
The common shape across all of these: documents you wrote, surfaced into a chat at conversation time. No typed state. No live sync from your actual tools. No persistent reasoning across conversations. No learning loop in your workspace.
┌─────────────────────────────────────────────────────────────────────┐ │ CLAUDE PROJECT / CLAUDE FOR WORK │ │ │ │ ┌──────────────┐ │ │ │ YOUR PDFs │ ──┐ │ │ │ YOUR DOCS │ │ │ │ │ SYS PROMPT │ │ surfaced into chat │ │ └──────────────┘ │ at conversation time │ │ ▼ │ │ ┌─────────┐ ┌──────────┐ │ │ user msg ────▶│ CLAUDE │────────▶│ reply │ │ │ └─────────┘ └──────────┘ │ │ │ │ [no live sync · no typed state · no learning loop · │ │ chat-scoped context · doc-shaped retrieval] │ └─────────────────────────────────────────────────────────────────────┘
This is fine for an assistant. It's not fine for a system that's supposed to run your business while you sleep.
The Cruma pattern: state → skill → action → learn.
Cruma's substrate is different in every layer that matters. Let me walk through it without shortcuts.
Storage: typed relational state
- Typed tables in Postgres:
offers,icps,voice_profiles,value_props,target_accounts,won_lost_patterns,sales_stages. Each is a proper relational table with foreign keys — not a JSON blob, not a folder of files. - Vector memory (
business_memory) for unstructured signals: founder posts, customer quotes, past hooks, lessons learned. Queried by embedding similarity, scoped per workspace. - Evidence ledger (
entities/observations/claims/signals). Every fact carriessource_url+observed_at+confidence+provider. No claim without a clickable receipt.
Update model: continuous ingestion
- Live sync from Gmail, HubSpot, Notion, GitHub, Stripe, OneDrive, calendar, Slack — flowing into typed state and vector memory.
- Every approval, edit, refuse in the system becomes a signal that updates voice profile, policy weights, and skill behavior.
- The brain doesn't go stale because the brain is alive.
Retrieval: the canonical brief
- A single function:
get_business_brief(workspace_id). It pulls from typed state + recent vector memory and synthesizes a 200-300 word "strategist's brief" — who you are, what you sell, who to, how you win, where you are, what's worked. - Every skill invocation calls this brief first. That's the architectural commitment. Every LLM call in the system starts with the same canon-anchored context. No skill reasons in a vacuum.
- The brief is prompt-cached using Anthropic's 90% read discount — so the cost of "every skill starts with full business context" stays near zero per call.
Execution: deterministic skills with agentic routing
- Skills are deterministic workflows — explicit DAGs of typed inputs, known nodes, LLM calls at predetermined points, typed outputs. The narrow agentic surface (which skill, with what args) sits above; the inside of each skill is deterministic.
- Every skill touches a real external effector (Gmail, HubSpot, Stripe, etc.). No "logic only" skills.
- Cross-skill validation: when a skill produces output that contradicts canon (e.g. ICP says enterprise CTOs, voice profile is founder-to-founder), the system flags it as a soft failure for review — caught before it ships.
Learning: failure-driven recursion
- Every skill invocation logged to
action_recordswith full structured context: inputs, outputs, success/failure, LLM calls, latency, cost. - Hard failures (exceptions, timeouts, malformed output) and soft failures (output below quality threshold, validation rule violated, expected signal missing) both trigger the learning loop.
- Failing inputs become test cases. Updated skills must pass them. Skills improve monotonically.
- This loop runs per workspace. The same skill in two different customers' workspaces evolves differently — shaped by the actual edits, approvals, and refusals in each.
┌─────────────────────────────────────────────────────────────────────┐ │ CRUMA │ │ │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ LIVE INGESTION │ │ │ │ Gmail · HubSpot · Notion · GitHub · Stripe · Drive ────┼─┐ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │ TYPED STATE VECTOR MEMORY │ │ │ │ offers · icps · voice_profiles ──┬── business_memory │ │ │ │ target_accounts · evidence │ (pgvector) │ │ │ │ action_records · claims │ │ │ │ └─────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌──────────────────────────────┐ │ │ │ get_business_brief() │ ◀── deterministic synthesizer │ │ │ prompt-cached (90% discount)│ called by every skill │ │ └──────────────────────────────┘ │ │ │ │ │ ┌───────────┴────────────┐ │ │ ▼ ▼ │ │ ┌──────────────┐ ┌──────────────┐ │ │ │ SKILL │ ──────▶│ CLAUDE (LLM) │ ──▶ output + evidence │ │ │ (typed I/O)│ │ │ chips │ │ └──────────────┘ └──────────────┘ │ │ │ │ │ │ logged to action_records │ │ ▼ │ │ ┌──────────────────────────┐ │ │ │ Failure-driven loop │ ◀── skills improve monotonically │ │ │ Cross-skill validation │ in YOUR workspace │ │ └──────────────────────────┘ │ │ │ │ [live sync · typed state · evidence ledger · learning loop · │ │ workspace-scoped context · skill-shaped retrieval · │ │ every call starts with canon] │ └─────────────────────────────────────────────────────────────────────┘
The shape that matters: Cruma's context isn't surfaced at conversation time — it's pre-shaped, kept alive, and injected canonically into every skill invocation.
Watch the difference on one real ask.
To make this concrete, run the same query through both systems.
"Draft a follow-up to Maya at Acme — she went quiet 11 days ago."
In Claude Project
- Claude reads conversation history + currently-attached files (system prompt + your uploaded PDFs).
- Looks for "Maya," "Acme" across attached files. Maybe finds a CRM export PDF you uploaded last month.
- Generates a draft based on what's in the project files + general writing skill.
- Doesn't know: what the actual most recent email thread says (unless you paste it), what your past follow-up patterns look like, whether your voice has shifted, whether similar deals closed or stalled. No live HubSpot data unless you re-export and re-upload.
In Cruma
- Skill
chase_stalled_dealis invoked — either by you or by the orchestrator detecting "no reply on Acme thread, day 11." get_business_brief()runs first — pulls offer + ICP + voice_profile + recent won/lost patterns. Cached read.business_memoryvector retrieval pulls the last 5 won-back deals that resembled this one — what worked, what didn't.target_accounts+ evidence ledger pull: full Acme history, the actual email thread (synced live from Gmail), past pricing offered, past objections logged.voice_profilesfeeds calibration — how you actually write follow-ups, not a SaaS template.- Claude (the underlying LLM) gets all of this as structured context and generates a draft.
- The draft surfaces with evidence chips: "Maya bought day 11. Policy applies. Mirrors past stalls."
- If you approve,
voice_profileweights tighten. If you refuse, the failure becomes a test case forchase_stalled_dealnext revision.
Same model. Same question. Wildly different substrate. The Claude Project version is a competent writer with a stale folder. The Cruma version is a system that knows your business, has the actual thread, and gets sharper every time you weigh in.
The mechanical comparison, line by line.
- Storage: flat blob of uploaded files + system prompt
- Update: manual re-upload
- Retrieval: document chunks at chat time
- Granularity: whole documents
- Persistence: conversation resets; project files persist
- External sync: none built-in
- Multi-user: shared files, no audit trail
- Learning loop: none
- Evidence trail: none
- Voice calibration: stays at whatever you uploaded
- Storage: typed relational state + vector memory + evidence ledger
- Update: continuous ingestion from your live tools
- Retrieval: canonical brief, prompt-cached, called by every skill
- Granularity: typed atoms (the offer, the ICP, the pattern)
- Persistence: brain state persists across every interaction and surface
- External sync: Gmail, HubSpot, Notion, GitHub, Stripe, Drive, calendar, Slack
- Multi-user: workspace roles + RLS + full audit trail
- Learning loop: failure-driven, per workspace, monotonic
- Evidence trail: every claim carries source + confidence
- Voice calibration: tightens with every approval and refuse
Why the substrate shows up in the outputs.
The architecture isn't trivia. It directly produces the behaviors that make Cruma feel different in use.
- Drafts in your voice, not a SaaS template. Because
voice_profileis a calibrated atom in the brief, not a paragraph in a PDF you wrote once. - No re-explaining your business. Because
get_business_brief()already ran. The system isn't asking "who are you again" — it's already standing in your shoes. - Self-improvement that actually compounds. Because every approval, edit, and refuse feeds
action_recordsand the failure-driven loop. Skills sharpen in your workspace specifically — not in some generic global model. - Evidence behind every action. Because the evidence ledger is built-in, not bolted on. Every draft carries "why this draft —" with sources. You can audit Cruma the way you'd audit a junior employee.
- Model-agnostic. Because the brain is upstream of the model. Plug Claude in, plug ChatGPT in, plug a custom agent in — the brain stays the same. You can't get stuck on a model that's no longer the best.
- Compounds with every model upgrade. Because the brain is structured to be consumed by whatever LLM is current. Smarter Claude next year? Same brain, more useful work. You're not betting on a model — you're betting on the layer underneath.
Claude assembles context from documents.
Cruma assembles context from state.