Under the hood

How Cruma assembles context.

Most AI tools feel different from one another. Mechanically, they're all built the same way. Cruma is built differently. Here's exactly how, with no marketing varnish.

Architecture · Cruma · 2026 · ~8 min read

Most people have never actually compared how an AI tool assembles the context it uses to answer them. You upload some files, you write a system prompt, you start chatting, and you assume the rest is magic.

It isn't magic. It's a specific architecture. And every major AI product — Claude Projects, ChatGPT Custom GPTs, Copilot, Notion AI — uses essentially the same one. Cruma uses a different one. That's the entire reason Cruma feels like it knows your business when other AI doesn't.

This page is for people who want to see mechanically what's actually different. No analogies, no AI-explainer hand-waving. Just the architecture.

Same model on top. Very different substrate underneath.

01 · How Claude assembles context

The standard pattern: documents → chat.

Every Anthropic product — Claude.ai, Claude Projects, Claude for Work, Claude Code — follows the same fundamental shape. So does ChatGPT. So does Copilot. So does Notion AI. Each varies at the edges, but the substrate is identical.

Claude Projects / Claude for Work

Claude.ai consumer + Memory

Claude Code

MCP (Model Context Protocol)

The common shape across all of these: documents you wrote, surfaced into a chat at conversation time. No typed state. No live sync from your actual tools. No persistent reasoning across conversations. No learning loop in your workspace.

┌─────────────────────────────────────────────────────────────────────┐
│                    CLAUDE PROJECT / CLAUDE FOR WORK                 │
│                                                                     │
│   ┌──────────────┐                                                  │
│   │  YOUR PDFs   │ ──┐                                              │
│   │  YOUR DOCS   │   │                                              │
│   │  SYS PROMPT  │   │ surfaced into chat                           │
│   └──────────────┘   │ at conversation time                         │
│                      ▼                                              │
│                  ┌─────────┐         ┌──────────┐                   │
│   user msg ────▶│ CLAUDE   │────────▶│  reply   │                   │
│                  └─────────┘         └──────────┘                   │
│                                                                     │
│   [no live sync · no typed state · no learning loop ·               │
│    chat-scoped context · doc-shaped retrieval]                      │
└─────────────────────────────────────────────────────────────────────┘

This is fine for an assistant. It's not fine for a system that's supposed to run your business while you sleep.
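As a rough illustration of the documents → chat shape, here is a minimal Python sketch. It is not how any of the named products is implemented: `Document`, `chunk`, `score`, and `assemble_context` are hypothetical names, and the keyword-overlap ranking is a stand-in for whatever retrieval a real product uses. The only point is that everything in the prompt comes from static uploads plus the current conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """An uploaded file, already converted to plain text (hypothetical type)."""
    name: str
    text: str


def chunk(doc: Document, size: int = 200) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [doc.text[i:i + size] for i in range(0, len(doc.text), size)]


def score(chunk_text: str, query: str) -> int:
    """Naive relevance: how many distinct query words appear as substrings."""
    lowered = chunk_text.lower()
    return sum(1 for word in set(query.lower().split()) if word in lowered)


def assemble_context(system_prompt: str, docs: list[Document],
                     history: list[str], user_msg: str, top_k: int = 2) -> str:
    """Build the prompt at conversation time from static uploads only."""
    chunks = [c for d in docs for c in chunk(d)]
    # Keep only chunks that match at least one query word, best match first.
    matching = [c for c in chunks if score(c, user_msg) > 0]
    ranked = sorted(matching, key=lambda c: score(c, user_msg), reverse=True)
    parts = [system_prompt, *ranked[:top_k], *history, user_msg]
    return "\n---\n".join(parts)
```

Nothing in this sketch reads from a live tool. If the CRM export is a month old, the prompt is a month old.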

02 · How Cruma assembles context

The Cruma pattern: state → skill → action → learn.

Cruma's substrate is different in every layer that matters. Let me walk through it without shortcuts.

Storage: typed relational state

Update model: continuous ingestion

Retrieval: the canonical brief

Execution: deterministic skills with agentic routing

Learning: failure-driven recursion

┌─────────────────────────────────────────────────────────────────────┐
│                              CRUMA                                  │
│                                                                     │
│   ┌──────────────────────────────────────────────────────────┐      │
│   │   LIVE INGESTION                                         │      │
│   │   Gmail · HubSpot · Notion · GitHub · Stripe · Drive ────┼─┐    │
│   └──────────────────────────────────────────────────────────┘ │    │
│                                                                ▼    │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  TYPED STATE                          VECTOR MEMORY         │   │
│   │  offers · icps · voice_profiles ──┬── business_memory       │   │
│   │  target_accounts · evidence       │   (pgvector)            │   │
│   │  action_records · claims          │                         │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                       │                                             │
│                       ▼                                             │
│   ┌──────────────────────────────┐                                  │
│   │  get_business_brief()        │  ◀── deterministic synthesizer  │
│   │  prompt-cached (90% discount)│      called by every skill      │
│   └──────────────────────────────┘                                  │
│                       │                                             │
│           ┌───────────┴────────────┐                                │
│           ▼                        ▼                                │
│   ┌──────────────┐         ┌──────────────┐                         │
│   │   SKILL      │ ──────▶│ CLAUDE (LLM) │ ──▶ output + evidence   │
│   │   (typed I/O)│         │              │     chips               │
│   └──────────────┘         └──────────────┘                         │
│           │                                                         │
│           │ logged to action_records                                │
│           ▼                                                         │
│   ┌──────────────────────────┐                                      │
│   │  Failure-driven loop     │  ◀── skills improve monotonically   │
│   │  Cross-skill validation  │      in YOUR workspace               │
│   └──────────────────────────┘                                      │
│                                                                     │
│   [live sync · typed state · evidence ledger · learning loop ·      │
│    workspace-scoped context · skill-shaped retrieval ·              │
│    every call starts with canon]                                    │
└─────────────────────────────────────────────────────────────────────┘

The shape that matters: Cruma's context isn't surfaced at conversation time — it's pre-shaped, kept alive, and injected canonically into every skill invocation.
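One way to picture "pre-shaped and injected canonically" is the Python sketch below. It is an assumption-laden illustration, not Cruma's code: the field names on `Offer`, `ICP`, and `VoiceProfile`, the `WorkspaceState` container, the `state` parameter on `get_business_brief`, and the `brief_cache_key` helper are all hypothetical. What it shows is the property the diagram relies on: a deterministic synthesizer turns equal typed state into byte-identical text, which is what lets a prompt cache reuse the same prefix across skill calls.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Offer:
    name: str
    price_usd: int


@dataclass(frozen=True)
class ICP:
    segment: str
    pain: str


@dataclass(frozen=True)
class VoiceProfile:
    tone: str
    sign_off: str


@dataclass(frozen=True)
class WorkspaceState:
    """Hypothetical snapshot of the typed rows a brief is built from."""
    offer: Offer
    icp: ICP
    voice: VoiceProfile


def get_business_brief(state: WorkspaceState) -> str:
    """Render typed state into a stable text prefix (assumed signature)."""
    # Sorted keys and fixed separators: equal state gives byte-identical text.
    return json.dumps(asdict(state), sort_keys=True, separators=(",", ":"))


def brief_cache_key(brief: str) -> str:
    """Hash of the brief; changes only when the underlying state changes."""
    return hashlib.sha256(brief.encode("utf-8")).hexdigest()
```

Because the brief is a pure function of state, a change to any typed field produces a new key, and an unchanged workspace keeps hitting the same cached prefix.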

03 · Same query, two systems

Watch the difference on one real ask.

To make this concrete, run the same query through both systems.

The query

"Draft a follow-up to Maya at Acme — she went quiet 11 days ago."

In a Claude Project

  1. Claude reads conversation history + currently-attached files (system prompt + your uploaded PDFs).
  2. Looks for "Maya," "Acme" across attached files. Maybe finds a CRM export PDF you uploaded last month.
  3. Generates a draft based on what's in the project files + general writing skill.
  4. Doesn't know: what the actual most recent email thread says (unless you paste it), what your past follow-up patterns look like, whether your voice has shifted, whether similar deals closed or stalled. No live HubSpot data unless you re-export and re-upload.

In Cruma

  1. Skill chase_stalled_deal is invoked — either by you or by the orchestrator detecting "no reply on Acme thread, day 11."
  2. get_business_brief() runs first — pulls offer + ICP + voice_profile + recent won/lost patterns. Cached read.
  3. business_memory vector retrieval pulls the last 5 won-back deals that resembled this one — what worked, what didn't.
  4. target_accounts + evidence ledger pull: full Acme history, the actual email thread (synced live from Gmail), past pricing offered, past objections logged.
  5. voice_profiles feeds calibration — how you actually write follow-ups, not a SaaS template.
  6. Claude (the underlying LLM) gets all of this as structured context and generates a draft.
  7. The draft surfaces with evidence chips: "Maya quiet 11 days. Policy applies. Mirrors past stalls."
  8. If you approve, voice_profile weights tighten. If you refuse, the failure becomes a test case for the next revision of chase_stalled_deal.
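Steps 1 through 7 above can be sketched in Python as follows. This is an illustration under stated assumptions, not Cruma's implementation: the `Evidence`, `MemoryItem`, and `SkillResult` types, the function signatures, and the source labels are hypothetical; the in-memory cosine ranking stands in for a pgvector query; and `llm` is any callable you pass in, so no real model API is assumed.

```python
import math
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Evidence:
    source: str        # where the fact came from (hypothetical label)
    claim: str
    confidence: float


@dataclass(frozen=True)
class MemoryItem:
    summary: str
    embedding: tuple[float, ...]


@dataclass
class SkillResult:
    draft: str
    evidence: list[Evidence] = field(default_factory=list)


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_deals(memory: list[MemoryItem], query: tuple[float, ...],
                  k: int = 5) -> list[MemoryItem]:
    """Top-k past deals by similarity (stand-in for a vector-store query)."""
    ranked = sorted(memory, key=lambda m: cosine(m.embedding, query), reverse=True)
    return ranked[:k]


def chase_stalled_deal(brief: str, memory: list[MemoryItem],
                       query: tuple[float, ...], thread: list[str],
                       days_quiet: int, llm: Callable[[str], str]) -> SkillResult:
    """Assemble structured context, call the model, attach evidence."""
    precedents = similar_deals(memory, query)
    context = "\n".join([
        f"BRIEF: {brief}",                                  # canonical brief first
        *(f"PRECEDENT: {m.summary}" for m in precedents),   # similar past deals
        *(f"THREAD: {msg}" for msg in thread),              # synced email thread
        f"TASK: follow up after {days_quiet} quiet days",
    ])
    evidence = [Evidence("gmail:thread", f"No reply for {days_quiet} days", 1.0)]
    for m in precedents:
        evidence.append(
            Evidence("business_memory", m.summary, round(cosine(m.embedding, query), 2))
        )
    return SkillResult(draft=llm(context), evidence=evidence)
```

The skill, not the chat, decides what goes into the prompt, and every piece of context it pulls is also recorded as evidence alongside the draft.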

Same model. Same question. Wildly different substrate. The Claude Project version is a competent writer with a stale folder. The Cruma version is a system that knows your business, has the actual thread, and gets sharper every time you weigh in.
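The approve/refuse step (step 8) is where the learning happens. A minimal Python sketch of one way such a loop could work is below. Everything here is hypothetical: the `VoiceWeights` representation, the update rate, the `RejectedCase` record, and the rule that a new skill revision must avoid every previously rejected output are assumptions made for illustration, not a description of Cruma's internals.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class VoiceWeights:
    """Per-trait weights for a voice profile (hypothetical representation)."""
    weights: dict[str, float] = field(default_factory=dict)

    def reinforce(self, traits: list[str], rate: float = 0.1) -> None:
        # Move each approved trait a fraction of the way toward 1.0.
        for trait in traits:
            current = self.weights.get(trait, 0.5)
            self.weights[trait] = current + rate * (1.0 - current)


@dataclass(frozen=True)
class RejectedCase:
    skill: str
    context: str
    rejected_output: str
    reason: str


@dataclass
class SkillRegistry:
    """Stores refused outputs as regression cases, keyed by skill name."""
    suite: dict[str, list[RejectedCase]] = field(default_factory=dict)

    def record_failure(self, case: RejectedCase) -> None:
        self.suite.setdefault(case.skill, []).append(case)

    def passes_suite(self, skill: str, candidate: Callable[[str], str]) -> bool:
        # A revision passes only if it avoids every previously refused output.
        return all(candidate(c.context) != c.rejected_output
                   for c in self.suite.get(skill, []))


def handle_feedback(approved: bool, skill: str, context: str, output: str,
                    traits: list[str], voice: VoiceWeights,
                    registry: SkillRegistry, reason: str = "") -> None:
    """Approval tightens voice weights; refusal becomes a test case."""
    if approved:
        voice.reinforce(traits)
    else:
        registry.record_failure(RejectedCase(skill, context, output, reason))
```

In this sketch the suite only grows, so a revision that ships has cleared every failure recorded so far. That is one concrete reading of a failure-driven loop.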

04 · Side by side

The mechanical comparison, line by line.

Claude Projects · ChatGPT · Copilot · Notion AI
Documents → chat
  • Storage: flat blob of uploaded files + system prompt
  • Update: manual re-upload
  • Retrieval: document chunks at chat time
  • Granularity: whole documents
  • Persistence: conversation resets; project files persist
  • External sync: none built-in
  • Multi-user: shared files, no audit trail
  • Learning loop: none
  • Evidence trail: none
  • Voice calibration: stays at whatever you uploaded

Cruma
State → skill → action → learn
  • Storage: typed relational state + vector memory + evidence ledger
  • Update: continuous ingestion from your live tools
  • Retrieval: canonical brief, prompt-cached, called by every skill
  • Granularity: typed atoms (the offer, the ICP, the pattern)
  • Persistence: brain state persists across every interaction and surface
  • External sync: Gmail, HubSpot, Notion, GitHub, Stripe, Drive, calendar, Slack
  • Multi-user: workspace roles + row-level security (RLS) + full audit trail
  • Learning loop: failure-driven, per workspace, monotonic
  • Evidence trail: every claim carries source + confidence
  • Voice calibration: tightens with every approval and refusal

05 · What this means in practice

Why the substrate shows up in the outputs.

The architecture isn't trivia. It directly produces the behaviors that make Cruma feel different in use.

Claude assembles context from documents.
Cruma assembles context from state.