How Cruma assembles context

Most people have never actually compared how an AI tool assembles the context it uses to answer them. You upload some files, you write a system prompt, you start chatting, and you assume the rest is magic.

It isn't magic. It's a specific architecture. And every major AI product — Claude Projects, ChatGPT Custom GPTs, Copilot, Notion AI — uses essentially the same one. Cruma uses a different one. That's the entire reason Cruma feels like it knows your business when other AI doesn't.

This page is for people who want to see mechanically what's actually different. No analogies, no AI-explainer hand-waving. Just the architecture.

Same model on top. Very different substrate underneath.

01 · How Claude assembles context

The standard pattern: documents → chat.

Every Anthropic product — Claude.ai, Claude Projects, Claude for Work, Claude Code — follows the same fundamental shape. So does ChatGPT. So does Copilot. So does Notion AI. Each varies at the edges, but the substrate is identical.

Claude Projects / Claude for Work

Storage: a flat collection of files you upload (PDFs, docs, code) plus a free-text system prompt. Blob-shaped.
Update model: manual. When your business changes, you re-upload. Files go stale between uploads.
Retrieval: files are surfaced into the context window per conversation, with some chunking and matching. No typed retrieval — it's "find the doc that mentions this."
Granularity: whole documents. There's no "the current ICP" or "the won/lost pattern for deals over $50K" as a queryable atom — it's a paragraph in a file you wrote.
Persistence: project files persist. Conversation state resets each chat. Last chat doesn't inform next chat.
External sync: none. If HubSpot changed yesterday, Claude doesn't know unless you re-export.
Multi-user: shared files, but no audited "this is what our company believes" layer.
Self-improvement: none. Your edits and rejections teach the project nothing for next time.

Claude.ai consumer + Memory

Conversation-scoped context window. The model sees your message + recent history.
Memory feature saves notes about you across conversations (manual or auto). Lives in user profile.
Optional web search at query time when enabled.
No business state. No integrations. No skills. No audit.

Claude Code

File-system-based context. CLAUDE.md files at project root + nested folders.
Pull-based retrieval: the agent reads files on demand via Read / Grep / Glob tools.
Git status, diff, and log as ambient context.
Session-scoped — each session is fresh. Persistence comes from files + CLAUDE.md you commit to your repo.

MCP (Model Context Protocol)

External tool and resource exposure to Claude (or any compatible client).
Context becomes pull-based at inference time — "ask the MCP server when you need it."
MCP itself doesn't organize the context. It's a transport protocol. The organizing happens inside whatever server you connect — which is exactly the slot Cruma fills.

The common shape across all of these: documents you wrote, surfaced into a chat at conversation time. No typed state. No live sync from your actual tools. No persistent reasoning across conversations. No learning loop in your workspace.

┌─────────────────────────────────────────────────────────────────────┐
│                    CLAUDE PROJECT / CLAUDE FOR WORK                 │
│                                                                     │
│   ┌──────────────┐                                                  │
│   │  YOUR PDFs   │ ──┐                                              │
│   │  YOUR DOCS   │   │                                              │
│   │  SYS PROMPT  │   │ surfaced into chat                           │
│   └──────────────┘   │ at conversation time                         │
│                      ▼                                              │
│                  ┌─────────┐         ┌──────────┐                   │
│   user msg ────▶│ CLAUDE   │────────▶│  reply   │                   │
│                  └─────────┘         └──────────┘                   │
│                                                                     │
│   [no live sync · no typed state · no learning loop ·               │
│    chat-scoped context · doc-shaped retrieval]                      │
└─────────────────────────────────────────────────────────────────────┘

This is fine for an assistant. It's not fine for a system that's supposed to run your business while you sleep.

02 · How Cruma assembles context

The Cruma pattern: state → skill → action → learn.

Cruma's substrate is different in every layer that matters. Let me walk through it without shortcuts.

Storage: typed relational state

Typed tables in Postgres: offers, icps, voice_profiles, value_props, target_accounts, won_lost_patterns, sales_stages. Each is a proper relational table with foreign keys — not a JSON blob, not a folder of files.
Vector memory (business_memory) for unstructured signals: founder posts, customer quotes, past hooks, lessons learned. Queried by embedding similarity, scoped per workspace.
Evidence ledger (entities / observations / claims / signals). Every fact carries source_url + observed_at + confidence + provider. No claim without a clickable receipt.
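
To make "typed" and "no claim without a receipt" concrete, here is a minimal sketch that uses plain Python dataclasses in place of Postgres tables. The fields source_url, observed_at, confidence, and provider come from the description above. The Icp and Claim shapes, the workspace_id field, the 0-to-1 confidence range, and the validation rule are assumptions for illustration, not Cruma's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Icp:
    """One row of a hypothetical icps table: a typed atom, not a paragraph."""
    workspace_id: str
    title: str           # e.g. "enterprise CTO"
    min_deal_usd: int


@dataclass(frozen=True)
class Claim:
    """One hypothetical evidence-ledger entry. Provenance is required."""
    workspace_id: str
    statement: str
    source_url: str
    observed_at: datetime
    confidence: float    # assumed here to be a score between 0.0 and 1.0
    provider: str

    def __post_init__(self) -> None:
        # Reject claims that arrive without a receipt or with an invalid score.
        if not self.source_url:
            raise ValueError("claim has no source_url")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


icp = Icp(workspace_id="ws_demo", title="enterprise CTO", min_deal_usd=50_000)

claim = Claim(
    workspace_id="ws_demo",
    statement="Acme was quoted annual pricing",
    source_url="https://example.com/thread/123",
    observed_at=datetime(2024, 1, 15, tzinfo=timezone.utc),
    confidence=0.9,
    provider="gmail",
)
```

Constructing a Claim with an empty source_url raises ValueError in this sketch, which is one way the "clickable receipt" rule could be enforced at write time rather than checked after the fact.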

Update model: continuous ingestion

Live sync from Gmail, HubSpot, Notion, GitHub, Stripe, OneDrive, calendar, Slack — flowing into typed state and vector memory.
Every approval, edit, and refusal in the system becomes a signal that updates the voice profile, policy weights, and skill behavior.
The brain doesn't go stale because the brain is alive.
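
One way to picture "a signal that updates the voice profile" is a small weight nudged by each reviewer decision. The page does not say how the update works, so everything below is an assumption: the VoiceTrait class, the SIGNAL_VALUE table, and the moving-average rule are hypothetical and chosen only because they are easy to follow.

```python
from dataclasses import dataclass

# Hypothetical mapping: how strongly each reviewer decision endorses a trait.
SIGNAL_VALUE = {"approve": 1.0, "edit": 0.5, "refuse": 0.0}


@dataclass
class VoiceTrait:
    """One hypothetical trait in a voice profile, e.g. 'short sentences'."""
    name: str
    weight: float = 0.5   # 0 = never use, 1 = always use


def apply_signal(trait: VoiceTrait, decision: str, rate: float = 0.2) -> VoiceTrait:
    """Move the trait weight a fraction of the way toward the signal value."""
    target = SIGNAL_VALUE[decision]
    trait.weight += rate * (target - trait.weight)
    return trait


trait = VoiceTrait("short sentences")
for decision in ["approve", "approve", "refuse"]:
    apply_signal(trait, decision)
```

Two approvals pull the weight up and the refusal pulls it back down, so the trait ends slightly above where it started. A real system would need to attribute each decision to the traits actually present in the draft, which this sketch leaves out.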

Retrieval: the canonical brief

A single function: get_business_brief(workspace_id). It pulls from typed state + recent vector memory and synthesizes a 200-300 word "strategist's brief" — who you are, what you sell, who to, how you win, where you are, what's worked.
Every skill invocation calls this brief first. That's the architectural commitment. Every LLM call in the system starts with the same canon-anchored context. No skill reasons in a vacuum.
The brief is prompt-cached. Anthropic bills cached reads at a 90% discount, so "every skill starts with full business context" adds very little cost per call.
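
A rough sketch of the brief and of the caching arithmetic, under stated assumptions. Only the name get_business_brief(workspace_id) and the 90% read discount come from the text above. The WorkspaceState class, the in-memory STATE dictionary, the template wording, the token count, and the price per million tokens are all hypothetical. The sketch builds the brief from a fixed template so it runs offline, and it ignores the cost of the first cache write.

```python
from dataclasses import dataclass

CACHED_READ_DISCOUNT = 0.90   # the read discount cited in the text


@dataclass(frozen=True)
class WorkspaceState:
    """Hypothetical in-memory stand-in for one workspace's typed tables."""
    company: str
    offer: str
    icp: str
    win_pattern: str


# Stand-in for Postgres, keyed by workspace_id.
STATE = {
    "ws_demo": WorkspaceState(
        company="Example Co",
        offer="onboarding automation for B2B SaaS",
        icp="heads of customer success at 50-500 person companies",
        win_pattern="deals close faster after a live pilot",
    )
}


def get_business_brief(workspace_id: str) -> str:
    """Assemble the brief from typed fields, always in the same order."""
    s = STATE[workspace_id]
    return (
        f"Who you are: {s.company}. "
        f"What you sell: {s.offer}. "
        f"Who to: {s.icp}. "
        f"How you win: {s.win_pattern}."
    )


def brief_input_cost_usd(calls: int, brief_tokens: int,
                         usd_per_million_tokens: float, cached: bool) -> float:
    """Input cost of sending the brief on every call, with or without caching."""
    rate = usd_per_million_tokens / 1_000_000
    if cached:
        rate *= 1.0 - CACHED_READ_DISCOUNT
    return calls * brief_tokens * rate


brief = get_business_brief("ws_demo")
# Assumed numbers: 1,000 skill calls, a 400-token brief, $3 per million input tokens.
uncached = brief_input_cost_usd(1000, 400, 3.0, cached=False)
cached = brief_input_cost_usd(1000, 400, 3.0, cached=True)
```

With these assumed numbers the brief costs about $1.20 uncached and about $0.12 cached across 1,000 calls. Because the brief is built from the same fields in the same order, identical state yields an identical prefix, which is the property a prompt cache depends on.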

Execution: deterministic skills with agentic routing

Skills are deterministic workflows — explicit DAGs of typed inputs, known nodes, LLM calls at predetermined points, typed outputs. The narrow agentic surface (which skill, with what args) sits above; the inside of each skill is deterministic.
Every skill touches a real external effector (Gmail, HubSpot, Stripe, etc.). No "logic only" skills.
Cross-skill validation: when a skill produces output that contradicts canon (e.g. ICP says enterprise CTOs, voice profile is founder-to-founder), the system flags it as a soft failure for review — caught before it ships.
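
The "explicit DAG" idea can be sketched with Python's standard graphlib module: nodes are plain functions, edges declare what must run first, and the one LLM call sits at a fixed node. The node names, the ctx dictionary, the stubbed fake_llm, and the banned-phrase rule are assumptions made for this example. Cruma's real skills and validation rules are not shown on this page.

```python
from graphlib import TopologicalSorter


def load_thread(ctx: dict) -> None:
    # Stand-in for reading the synced email thread.
    ctx["thread"] = "Last email: pricing question, no reply since."


def draft(ctx: dict) -> None:
    # The single LLM call happens at this fixed point in the workflow.
    ctx["draft"] = ctx["llm"](ctx["brief"], ctx["thread"])


def validate(ctx: dict) -> None:
    # Soft failure: the draft uses a phrase the canonical voice rules out.
    text = ctx["draft"].lower()
    ctx["soft_failures"] = [
        f"banned phrase: {p}" for p in ctx["banned_phrases"] if p in text
    ]


NODES = {"load_thread": load_thread, "draft": draft, "validate": validate}
# Each node maps to the set of nodes that must run before it.
EDGES = {"load_thread": set(), "draft": {"load_thread"}, "validate": {"draft"}}


def run_skill(ctx: dict) -> dict:
    """Run every node once, in an order that respects the declared edges."""
    for name in TopologicalSorter(EDGES).static_order():
        NODES[name](ctx)
    return ctx


def fake_llm(brief: str, thread: str) -> str:
    # Offline stub so the sketch runs without a model.
    return "Dear Sir or Madam, circling back on pricing."


result = run_skill({
    "brief": "Voice: founder-to-founder.",
    "llm": fake_llm,
    "banned_phrases": ["dear sir or madam"],
})
```

Here the stubbed draft opens with a formal salutation that the founder-to-founder voice rules out, so result["soft_failures"] holds one entry and the draft would be held for review instead of sent.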

Learning: failure-driven recursion

Every skill invocation is logged to action_records with full structured context: inputs, outputs, success/failure, LLM calls, latency, cost.
Hard failures (exceptions, timeouts, malformed output) and soft failures (output below quality threshold, validation rule violated, expected signal missing) both trigger the learning loop.
Failing inputs become test cases. Updated skills must pass them, so a revision can never reintroduce a failure that has already been caught. Skills improve monotonically.
This loop runs per workspace. The same skill in two different customers' workspaces evolves differently — shaped by the actual edits, approvals, and refusals in each.
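
The loop above behaves like a regression suite that grows from real failures. A minimal sketch, assuming a skill is a function from one string to another and a quality check is a function that returns True or False. RegressionSuite, skill_v1, skill_v2, and the check itself are hypothetical names for illustration.

```python
from typing import Callable

Skill = Callable[[str], str]
Check = Callable[[str], bool]


class RegressionSuite:
    """Per-workspace store of inputs that a skill has failed on."""

    def __init__(self, check: Check) -> None:
        self.check = check
        self.cases: list[str] = []

    def _passes(self, skill: Skill, skill_input: str) -> bool:
        try:
            return self.check(skill(skill_input))   # False = soft failure
        except Exception:
            return False                            # exception = hard failure

    def record(self, skill: Skill, skill_input: str) -> bool:
        """Run the skill and keep the input as a test case if it fails."""
        ok = self._passes(skill, skill_input)
        if not ok and skill_input not in self.cases:
            self.cases.append(skill_input)
        return ok

    def accepts(self, candidate: Skill) -> bool:
        """A revised skill is accepted only if it passes every stored case."""
        return all(self._passes(candidate, case) for case in self.cases)


def skill_v1(name: str) -> str:
    # Raises IndexError when the name is empty.
    return f"Hi {name.split()[0]}, following up."


def skill_v2(name: str) -> str:
    parts = name.split()
    first = parts[0] if parts else "there"
    return f"Hi {first}, following up."


suite = RegressionSuite(check=lambda out: out.startswith("Hi ") and len(out) < 80)
suite.record(skill_v1, "Maya Chen")   # passes, nothing stored
suite.record(skill_v1, "")            # hard failure, input stored as a case
```

After these two calls, suite.accepts(skill_v1) is False and suite.accepts(skill_v2) is True. The stored cases only ever grow, which is what makes "monotonic" a checkable property: an accepted revision passes everything its predecessors failed on.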

┌─────────────────────────────────────────────────────────────────────┐
│                              CRUMA                                  │
│                                                                     │
│   ┌──────────────────────────────────────────────────────────┐      │
│   │   LIVE INGESTION                                         │      │
│   │   Gmail · HubSpot · Notion · GitHub · Stripe · Drive ────┼─┐    │
│   └──────────────────────────────────────────────────────────┘ │    │
│                                                                ▼    │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  TYPED STATE                          VECTOR MEMORY         │   │
│   │  offers · icps · voice_profiles ──┬── business_memory       │   │
│   │  target_accounts · evidence       │   (pgvector)            │   │
│   │  action_records · claims          │                         │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                       │                                             │
│                       ▼                                             │
│   ┌──────────────────────────────┐                                  │
│   │  get_business_brief()        │  ◀── deterministic synthesizer  │
│   │  prompt-cached (90% discount)│      called by every skill      │
│   └──────────────────────────────┘                                  │
│                       │                                             │
│           ┌───────────┴────────────┐                                │
│           ▼                        ▼                                │
│   ┌──────────────┐         ┌──────────────┐                         │
│   │   SKILL      │ ──────▶│ CLAUDE (LLM) │ ──▶ output + evidence   │
│   │   (typed I/O)│         │              │     chips               │
│   └──────────────┘         └──────────────┘                         │
│           │                                                         │
│           │ logged to action_records                                │
│           ▼                                                         │
│   ┌──────────────────────────┐                                      │
│   │  Failure-driven loop     │  ◀── skills improve monotonically   │
│   │  Cross-skill validation  │      in YOUR workspace               │
│   └──────────────────────────┘                                      │
│                                                                     │
│   [live sync · typed state · evidence ledger · learning loop ·      │
│    workspace-scoped context · skill-shaped retrieval ·              │
│    every call starts with canon]                                    │
└─────────────────────────────────────────────────────────────────────┘

The shape that matters: Cruma's context isn't surfaced at conversation time — it's pre-shaped, kept alive, and injected canonically into every skill invocation.

03 · Same query, two systems

Watch the difference on one real ask.

To make this concrete, run the same query through both systems.

The query

"Draft a follow-up to Maya at Acme — she went quiet 11 days ago."

In Claude Project

Claude reads conversation history + currently-attached files (system prompt + your uploaded PDFs).
Looks for "Maya," "Acme" across attached files. Maybe finds a CRM export PDF you uploaded last month.
Generates a draft based on what's in the project files + general writing skill.
Doesn't know: what the actual most recent email thread says (unless you paste it), what your past follow-up patterns look like, whether your voice has shifted, whether similar deals closed or stalled. No live HubSpot data unless you re-export and re-upload.

In Cruma

Skill chase_stalled_deal is invoked — either by you or by the orchestrator detecting "no reply on Acme thread, day 11."
get_business_brief() runs first — pulls offer + ICP + voice_profile + recent won/lost patterns. Cached read.
business_memory vector retrieval pulls the last 5 won-back deals that resembled this one — what worked, what didn't.
target_accounts + evidence ledger pull: full Acme history, the actual email thread (synced live from Gmail), past pricing offered, past objections logged.
voice_profiles feeds calibration — how you actually write follow-ups, not a SaaS template.
Claude (the underlying LLM) gets all of this as structured context and generates a draft.
The draft surfaces with evidence chips: "No reply, day 11. Policy applies. Mirrors past stalls."
If you approve, voice_profile weights tighten. If you refuse, the failure becomes a test case for the next revision of chase_stalled_deal.
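
The vector retrieval step in this walkthrough can be sketched as a similarity search that filters by workspace before ranking. The text says business_memory is queried by embedding similarity and scoped per workspace. The MEMORY list, the three-dimensional toy embeddings, and the recall function are assumptions for illustration. A real deployment would use model-generated embeddings and a database index rather than a Python list.

```python
import math

# Toy stand-in for the business_memory table.
MEMORY = [
    {"workspace_id": "ws_demo", "text": "Won back after sharing a case study",
     "embedding": [0.9, 0.1, 0.0]},
    {"workspace_id": "ws_demo", "text": "Founder post about hiring",
     "embedding": [0.0, 0.2, 0.9]},
    {"workspace_id": "ws_other", "text": "Another customer's stalled deal",
     "embedding": [1.0, 0.0, 0.0]},
]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(workspace_id: str, query_embedding: list[float], k: int = 5) -> list[str]:
    """Return the k most similar memories, restricted to one workspace."""
    scoped = [m for m in MEMORY if m["workspace_id"] == workspace_id]
    ranked = sorted(
        scoped,
        key=lambda m: cosine(m["embedding"], query_embedding),
        reverse=True,
    )
    return [m["text"] for m in ranked[:k]]


hits = recall("ws_demo", [1.0, 0.0, 0.0], k=1)
```

The ws_other row matches the query exactly, yet it never appears in the results because the workspace filter runs before ranking. That ordering is what "scoped per workspace" means in practice.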

Same model. Same question. Wildly different substrate. The Claude Project version is a competent writer with a stale folder. The Cruma version is a system that knows your business, has the actual thread, and gets sharper every time you weigh in.

04 · Side by side

The mechanical comparison, line by line.

Claude Projects · ChatGPT · Copilot · Notion AI

Documents → chat

Storage: flat blob of uploaded files + system prompt
Update: manual re-upload
Retrieval: document chunks at chat time
Granularity: whole documents
Persistence: conversation resets; project files persist
External sync: none built-in
Multi-user: shared files, no audit trail
Learning loop: none
Evidence trail: none
Voice calibration: stays at whatever you uploaded

Cruma

State → skill → action → learn

Storage: typed relational state + vector memory + evidence ledger
Update: continuous ingestion from your live tools
Retrieval: canonical brief, prompt-cached, called by every skill
Granularity: typed atoms (the offer, the ICP, the pattern)
Persistence: brain state persists across every interaction and surface
External sync: Gmail, HubSpot, Notion, GitHub, Stripe, Drive, calendar, Slack
Multi-user: workspace roles + RLS + full audit trail
Learning loop: failure-driven, per workspace, monotonic
Evidence trail: every claim carries source + confidence
Voice calibration: tightens with every approval and refusal

05 · What this means in practice

Why the substrate shows up in the outputs.

The architecture isn't trivia. It directly produces the behaviors that make Cruma feel different in use.

Drafts in your voice, not a SaaS template. Because voice_profile is a calibrated atom in the brief, not a paragraph in a PDF you wrote once.
No re-explaining your business. Because get_business_brief() already ran. The system isn't asking "who are you again" — it's already standing in your shoes.
Self-improvement that actually compounds. Because every approval, edit, and refusal feeds action_records and the failure-driven loop. Skills sharpen in your workspace specifically, not in some generic global model.
Evidence behind every action. Because the evidence ledger is built-in, not bolted on. Every draft carries "why this draft —" with sources. You can audit Cruma the way you'd audit a junior employee.
Model-agnostic. Because the brain is upstream of the model. Plug Claude in, plug ChatGPT in, plug a custom agent in — the brain stays the same. You can't get stuck on a model that's no longer the best.
Compounds with every model upgrade. Because the brain is structured to be consumed by whatever LLM is current. Smarter Claude next year? Same brain, more useful work. You're not betting on a model — you're betting on the layer underneath.

Claude assembles context from documents.
Cruma assembles context from state.

Claim your seat → Read the manifesto
