Decompose is the missing cognitive primitive for AI agents. Text in, classified structured units out. No LLM. No setup. One function call.
Before / After
Real output from the MCP Transport Specification.
attention — 0–10 priority score. Unit 3 scores 4.5 (security risk + directive authority). The overview scores 0.0. Your agent knows where to focus.
risk: security — "attackers", "DNS rebinding", "authentication" trigger security risk detection. The overview and guidelines carry no risk signal.
authority — MUST/MUST NOT = mandatory/prohibitive. SHOULD = directive. Plain prose = informational. Your agent knows what's binding vs. advisory.
actionable — Unit 3 requires action: validate Origin headers, bind to localhost, implement auth. The overview requires nothing.
source — This is real output. Run it yourself: `curl spec.modelcontextprotocol.io | python -m decompose`
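The full output schema isn't reproduced here, but based on the fields described above, a single decomposed unit looks roughly like this (field names and values illustrative, drawn from the labels in this README):

```json
{
  "heading_path": ["MCP Transport Specification", "Security Warning"],
  "text": "Servers MUST validate the Origin header ...",
  "authority": "mandatory",
  "risk": "security",
  "attention": 4.5,
  "actionable": true,
  "irreducible": true
}
```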
What it does
Mandatory, prohibitive, directive, permissive, informational, conditional. Knows the difference between "shall," "should," and "may."
Safety-critical, compliance, financial, contractual, advisory. Each chunk gets scored and labeled by risk category.
Standards, dates, dollar amounts, percentages. Deterministic regex. No hallucinations. No API calls.
Detects content that must be preserved verbatim — legal mandates, threshold values, safety limits. Tells your model what it cannot summarize.
Header-aware Markdown chunking. Sentence-boundary text splitting. Each chunk preserves its heading path and structural context.
Every unit gets an attention score from 0–10. Your agent knows which chunks matter most without reading all of them.
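The classification idea above can be sketched in a few lines. This is not decompose's actual implementation — just a minimal, illustrative version of keyword-based authority detection, regex entity matching, and attention scoring (the keyword lists, multipliers, and score values are assumptions):

```python
import re

# Illustrative sketch only — not decompose's real implementation.
# Authority labels are checked most-specific first so that
# "MUST NOT" is caught before "MUST".
AUTHORITY_KEYWORDS = [
    ("prohibitive", r"\b(MUST NOT|SHALL NOT)\b"),
    ("mandatory",   r"\b(MUST|SHALL)\b"),
    ("directive",   r"\bSHOULD(?: NOT)?\b"),
    ("permissive",  r"\bMAY\b"),
]

# Deterministic entity patterns: dollar amounts and percentages.
MONEY_RE   = re.compile(r"\$\d[\d,]*(?:\.\d+)?")
PERCENT_RE = re.compile(r"\d+(?:\.\d+)?%")

def classify_authority(text: str) -> str:
    for label, pattern in AUTHORITY_KEYWORDS:
        if re.search(pattern, text):
            return label
    return "informational"

def attention_score(text: str) -> float:
    # Stronger authority raises the base score; a detected risk
    # signal (here, money/percent entities) multiplies it.
    base = {"prohibitive": 4.0, "mandatory": 4.0,
            "directive": 2.0, "permissive": 0.5,
            "informational": 0.0}[classify_authority(text)]
    if MONEY_RE.search(text) or PERCENT_RE.search(text):
        base *= 1.5  # illustrative risk multiplier
    return min(base, 10.0)
```

Because everything is keyword and regex driven, the output is deterministic: the same text always gets the same labels, with no model calls involved.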
Install
OpenClaw / ClawHub
Install the skill for any OpenClaw-compatible agent:
MCP Integration
Add one block to your MCP config. Your agent gets two tools: decompose_text and decompose_url.
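The exact invocation isn't shown here, but a typical MCP config block has this shape (the `command`, `args`, and any flags are illustrative — check the project's docs for the real server command):

```json
{
  "mcpServers": {
    "decompose": {
      "command": "python",
      "args": ["-m", "decompose", "--mcp"]
    }
  }
}
```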
Benchmarks
11 documents, 162,107 characters, run on Apple Silicon.
How it trains your model
Decompose doesn't just help agents read — it produces the structured labels that make models smarter over time.
irreducible: true — The financial calculations ($10,000 × 1.06^5 = $13,382.25) contain exact values. A model trained on this label learns: never paraphrase dollar amounts, formulas, or threshold values. This is how you prevent hallucinated numbers.
irreducible: false — The "Why let Claude think?" section is advisory prose. A model trained on this label learns: safe to summarize, reword, or compress. This is how you save tokens without losing meaning.
risk: financial — Decompose detected dollar amounts and investment calculations. A model fine-tuned on these labels learns to flag financial content for human review — even when the surrounding text looks like a tutorial.
attention score — Unit 7 scores 1.5 (financial risk multiplier). Unit 3 scores 0.1 (permissive + informational). When building RAG or curriculum-weighted training, attention tells you which samples to oversample and which to skip.
Use attention scores to weight training samples. High-attention units get oversampled. Informational filler gets downsampled. Your model learns to prioritize what matters.
Each unit is a natural (input, label) pair. Input: the raw text. Labels: authority, risk, actionable, irreducible. Fine-tune a model to classify documents the way decompose does.
Units flagged PRESERVE_VERBATIM teach the model which content it must never paraphrase: exact figures, legal mandates, threshold values.
Instead of stuffing entire documents into context, feed only units above your attention threshold. The model sees 1 unit instead of 10, with metadata explaining why it matters.
heading_path gives your agent document topology without reading the whole thing. Route security units to a safety chain, financial units to an audit chain, informational units to /dev/null.
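Turning units into training data can be sketched as follows. The unit field names (`text`, `authority`, `risk`, `irreducible`, `attention`) are assumptions based on the labels this README describes, and the minimum weight is illustrative:

```python
import random

def to_training_pairs(units):
    """Each unit is a natural (input, label) pair: raw text in,
    decompose's classification labels out.
    NOTE: field names are assumed from this README, not a real schema."""
    return [
        (u["text"], {"authority": u["authority"],
                     "risk": u["risk"],
                     "irreducible": u["irreducible"]})
        for u in units
    ]

def weighted_sample(units, k, rng=random):
    """Oversample high-attention units, downsample filler.
    A small floor weight keeps low-attention units reachable."""
    weights = [max(u["attention"], 0.1) for u in units]
    return rng.choices(units, weights=weights, k=k)
```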
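Context filtering and routing might look like this sketch (same assumed unit fields as above; the chain names are hypothetical):

```python
def select_context(units, threshold=2.0):
    """Feed only units above the attention threshold into context,
    instead of stuffing the whole document in."""
    return [u for u in units if u["attention"] >= threshold]

def route(unit):
    """Route units by risk category. Chain names are hypothetical."""
    if unit["risk"] == "security":
        return "safety_chain"
    if unit["risk"] == "financial":
        return "audit_chain"
    if unit["authority"] == "informational":
        return "discard"
    return "default_chain"
```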
All of this runs locally in ~6ms per document. No API calls. No GPU. No tokens consumed. Structure your data before it ever touches a model.
Your model is only as good as what you feed it. Feed it structure.