knowledge-context-protocol

No description provided.

Field	Value
GitHub	https://github.com/Cantara/knowledge-context-protocol
Language	Java
Stars	12
Last updated	2026-03-15

README

Knowledge Context Protocol (KCP)

A structured metadata standard that makes knowledge navigable by AI agents.

KCP is to knowledge what MCP is to tools.

→ Website · Read the spec · Read the proposal

The Model Context Protocol defines how agents connect to tools. KCP defines how knowledge is structured so those tools can serve it effectively. MCP solved the tool connectivity problem. KCP addresses the knowledge structure problem that remains.

In 30 Seconds

Drop a knowledge.yaml at the root of any project. Agents stop guessing and start navigating.

kcp_version: "0.10"
project: my-project
version: 1.0.0
units:
  - id: overview
    path: README.md
    intent: "What is this project and how do I get started?"
    scope: global
    audience: [human, agent]
    triggers: [overview, getting started, quickstart]

Validated result: 53–80% fewer agent tool calls vs unguided exploration (tested across 5 frameworks: crewAI, AutoGen, smolagents, LangChain, OpenCode).

Five minutes to Level 1. See adopting KCP in existing projects.

The Problem with llms.txt

Jeremy Howard's llms.txt solved a real problem: it gives AI agents a canonical, machine-readable index of a website. For personal sites and small documentation sets, it works well.

But llms.txt has six structural limitations that matter at scale:

Limitation	What it means
Flat topology	Lists what exists. Cannot express that A depends on B, or that C supersedes D.
No selective loading	All or nothing: the index (small) or the full dump (huge). No middle ground.
No intent metadata	A URL and a description. No way to say what task this knowledge is relevant to.
No freshness signal	Cannot distinguish a document written yesterday from one written three years ago.
No tooling connection	A static file. No query interface, no dependency graph, no retrieval integration.
Scale collapse	Works for 27 blog posts. Fails for 8,934 files across an enterprise.

These are structural limitations, not incidental ones. A bigger text file does not solve them.

What KCP Provides

KCP is a file format specification — a knowledge.yaml manifest you drop at the root of a project or documentation site. It adds the metadata layer that llms.txt cannot express:

Topology: what depends on what, what supersedes what
Intent: what task or question each knowledge unit answers
Freshness: when each unit was last validated, and against what
Selective loading: agents query by task context, not by URL guessing
Audience targeting: which units are for humans, which for agents, which for both

The Spec

Root Manifest: `knowledge.yaml`

project: <name>
version: 1.0.0
updated: <ISO date>
language: <BCP 47 tag>             # optional — default for all units
license: <SPDX identifier>        # optional — default for all units
indexing: open | read-only | no-train | none  # optional — default for all units
hints:                             # optional — manifest-level aggregate hints
  total_token_estimate: <integer>
  recommended_entry_point: <unit-id>
  has_summaries: true | false
trust:                             # optional — publisher provenance and audit
  provenance:
    publisher: <string>
    publisher_url: <string>
    contact: <string>
  audit:                           # optional — agent audit requirements
    agent_must_log: true | false
    require_trace_context: true | false
auth:                              # optional — authentication methods
  methods:
    - type: none | oauth2 | api_key
delegation:                        # optional — delegation constraints (v0.7)
  max_depth: 3                     # 0 = no delegation; absent = unlimited
  require_capability_attenuation: true
  audit_chain: true
  human_in_the_loop:               # object, not string
    required: true
    approval_mechanism: oauth_consent | uma | custom
    docs_url: "https://example.com/hitl-policy"
compliance:                        # optional — data governance (v0.7)
  data_residency: [EU]
  sensitivity: confidential        # public | internal | confidential | restricted
  regulations: [GDPR, NIS2]
  restrictions: [no_ai_training]
rate_limits:                       # optional — agent rate limiting (v0.8)
  default:
    requests_per_minute: 60
    requests_per_day: 10000
manifests:                         # optional — federation declarations (v0.9)
  - id: <local-identifier>
    url: "https://example.com/knowledge.yaml"
    label: <string>
    relationship: child | foundation | governs | peer | archive
    local_mirror: "./mirrors/local.yaml"
payment:                           # optional — default monetisation tier
  default_tier: free | metered | subscription

units:
  - id: <unique-identifier>
    path: <relative path to content file>
    kind: knowledge | schema | service | policy | executable  # optional; default: knowledge
    intent: "<What question does this answer?>"
    format: markdown | pdf | openapi | ...  # optional
    content_type: <MIME type>              # optional
    language: <BCP 47 tag>                 # optional
    scope: global | project | module
    audience: [human, agent, developer, operator, architect]
    license: <SPDX identifier or object>  # optional
    validated: <ISO date>
    update_frequency: hourly | daily | weekly | monthly | rarely | never  # optional
    indexing: open | read-only | no-train | none  # optional
    depends_on: [<unit-id>, ...]       # optional
    supersedes: <unit-id>              # optional
    triggers: [<keyword>, ...]         # optional
    hints:                             # optional — context window hints
      token_estimate: <integer>        # approximate token count
      load_strategy: eager | lazy | never  # when to load; default: lazy
      priority: critical | supplementary | reference  # eviction order; default: supplementary
      density: dense | standard | verbose  # information density; default: standard
      summary_available: true | false  # shorter version exists in this manifest
      summary_unit: <unit-id>          # id of the summary unit
      summary_of: <unit-id>            # id of the full unit this summarises
    access: public | authenticated | restricted  # optional; default: public
    auth_scope: <string>               # optional — opaque scope token for restricted units
    sensitivity: public | internal | confidential | restricted  # optional
    deprecated: true | false          # optional; default: false
    external_depends_on:               # optional — cross-manifest dependencies (v0.9)
      - manifest: <manifests-id>
        unit: <remote-unit-id>
        on_failure: skip | warn | degrade  # default: skip
    payment:                           # optional — override root default
      default_tier: free | metered | subscription

relationships:
  - from: <unit-id>
    to: <unit-id>
    type: enables | context | supersedes | contradicts | depends_on | governs

external_relationships:                # optional — cross-manifest relationships (v0.9)
  - from_manifest: <manifests-id>      # optional — omit = this manifest
    from_unit: <unit-id>
    to_manifest: <manifests-id>        # optional — omit = this manifest
    to_unit: <unit-id>
    type: enables | context | supersedes | contradicts | depends_on | governs

Knowledge Unit Fields

Field	Required	Description
`id`	yes	Unique identifier within the project
`path`	yes	Relative path to the content file
`kind`	optional	Type of artifact: `knowledge` (default), `schema`, `service`, `policy`, `executable`
`intent`	yes	One sentence: what question does this unit answer?
`format`	optional	Content format: `markdown`, `pdf`, `openapi`, `json-schema`, `jupyter`, etc.
`content_type`	optional	Full MIME type for precise format identification
`language`	optional	BCP 47 language tag (e.g. `en`, `no`, `de`)
`scope`	yes	`global`, `project`, or `module`
`audience`	yes	Who this is for: `human`, `agent`, `developer`, `operator`, `architect`
`license`	optional	SPDX identifier or structured license object
`validated`	recommended	ISO date when content was last confirmed accurate
`update_frequency`	optional	How often content changes: `hourly`, `daily`, `weekly`, `monthly`, `rarely`, `never`
`indexing`	optional	AI crawling permissions: `open`, `read-only`, `no-train`, `none`, or structured object
`depends_on`	optional	Units that must be understood before this one
`supersedes`	optional	The unit-id this replaces
`triggers`	optional	Task contexts or keywords that make this unit relevant
`hints`	optional	Advisory context window hints: `token_estimate`, `load_strategy`, `priority`, `density`, `summary_available`, `summary_unit`, `summary_of`
`access`	optional	Who can fetch this unit: `public` (default), `authenticated`, `restricted`
`auth_scope`	optional	Opaque scope token indicating which credential scope is needed (meaningful when `access` is `restricted`)
`sensitivity`	optional	Information classification: `public`, `internal`, `confidential`, `restricted`
`deprecated`	optional	If `true`, this unit is present but should not be used for new development
`delegation`	optional	Per-unit delegation override: `max_depth`, `require_capability_attenuation`, `audit_chain`, `human_in_the_loop`
`compliance`	optional	Per-unit compliance override: `data_residency`, `sensitivity`, `regulations`, `restrictions`
`rate_limits`	optional	Rate-limit metadata: `default.requests_per_minute`, `default.requests_per_day` (root-level default; per-unit override)
`payment`	optional	Monetisation tier: `default_tier: free \\| metered \\| subscription`

Minimum Viable KCP

Five fields per unit are enough to start:

kcp_version: "0.10"
project: my-project
version: 1.0.0
units:
  - id: overview
    path: README.md
    intent: "What is this project and how do I get started?"
    scope: global
    audience: [human, agent]

The standard allows complexity but does not demand it.

Complete Example

# knowledge.yaml
kcp_version: "0.10"
project: wiki.example.org
version: 1.0.0
updated: "2026-02-28"
language: en
license: "Apache-2.0"
indexing: open

units:
  - id: about
    path: about.md
    intent: "Who maintains this project? Background, current work, contact."
    scope: global
    audience: [human, agent]
    validated: "2026-02-24"
    update_frequency: monthly

  - id: methodology
    path: methodology/overview.md
    intent: "What development methodology is used? Principles, evidence, adoption."
    format: markdown
    scope: global
    audience: [developer, architect, agent]
    depends_on: [about]
    validated: "2026-02-13"
    triggers: ["methodology", "productivity", "workflow"]

  - id: api-spec
    kind: schema
    path: api/openapi.yaml
    intent: "What endpoints does the API expose?"
    format: openapi
    scope: module
    audience: [developer, agent]
    validated: "2026-02-25"
    update_frequency: weekly

  - id: knowledge-infrastructure
    path: tools/knowledge-infra.md
    intent: "How is knowledge infrastructure set up? Architecture, indexing, deployment."
    scope: global
    audience: [developer, devops, agent]
    depends_on: [methodology]
    validated: "2026-02-28"
    supersedes: knowledge-infra-v1
    triggers: ["knowledge infrastructure", "MCP", "code search", "indexing"]

relationships:
  - from: methodology
    to: knowledge-infrastructure
    type: enables
  - from: about
    to: methodology
    type: context

Adoption Gradient

KCP is designed to be adopted incrementally.

Level 1 — Personal sites and small projects Drop a knowledge.yaml alongside your llms.txt. Add id, path, and intent for your key pages. Five minutes. Immediately navigable by agents.

Level 2 — Open source projects Add depends_on, validated, and hints (at minimum token_estimate, load_strategy, and the summary_available / summary_unit / summary_of trio). Agents can now load documentation in dependency order, check freshness before acting on it, and prefer short TL;DR files over full documents when answering common questions.

Level 3 — Enterprise documentation Use the full field set including triggers, audience, relationships, and advanced hints (priority, density, chunking). Build knowledge-graph-navigable documentation that supports multiple agent roles querying the same corpus with different task contexts and constrained context budgets.

Level 4 — Multi-agent systems Add auth, delegation, and compliance blocks. Pair with an A2A Agent Card (/.well-known/agent.json) and link to KCP via the knowledgeManifest convention. Enforce delegation depth, capability attenuation, HITL gates, and data residency constraints across agent chains.

Relationship to HATEOAS

KCP shares a foundational insight with HATEOAS (Hypermedia As The Engine Of Application State): typed, directional relationships between resources are necessary for navigation — a flat list is not enough. KCP's depends_on, supersedes, and relationships fields are the same idea as HATEOAS link relations.

The key difference is static vs dynamic. HATEOAS links are generated per-response based on live resource state. KCP is a committed file: it declares topology at authoring time without a server. Where KCP goes further: the intent field (what question does this unit answer?), validated (human-confirmed freshness, not just file modification time), and audience targeting — concerns that arise specifically when the consumer is a context-window-constrained AI agent rather than an API client.

See SPEC.md §11 for a full treatment.

Relationship to MCP and Synthesis

MCP (Model Context Protocol) defines how agents connect to tools. KCP defines how knowledge is structured for those tools to serve.

Synthesis is a knowledge infrastructure tool and reference implementation of a KCP-native knowledge server. It indexes workspaces — code, documentation, configuration, PDFs — and serves them via MCP with sub-second retrieval. KCP is the format specification; Synthesis is one engine that implements it.

synthesis export --format kcp will generate a knowledge.yaml from an existing Synthesis index automatically.

A2A (Agent-to-Agent, Google) defines how agents discover and invoke each other via /.well-known/agent.json Agent Cards. A2A is the transport/invocation layer (per-agent granularity). KCP is the knowledge-access layer (per-unit granularity). They are complementary: an Agent Card describes what an agent can do; a KCP manifest describes what it knows. See SPEC.md §12 and examples/a2a-agent-card/.

kcp-commands is a KCP-native Claude Code hook that applies the KCP principle at the Bash tool boundary. Each manifest is a knowledge.yaml-compatible description of a CLI command: syntax hints injected before execution (Phase A), noise-filtered output after execution (Phase B). 283 manifests bundled; unknown commands auto-generate manifests from --help output. Measured saving: 67,352 tokens per session — 33.7% of a 200K context window recovered.

opencode-kcp-plugin is a KCP-native plugin for OpenCode. It injects the knowledge.yaml knowledge map into every session's system prompt and annotates glob/grep results with KCP intent strings — reducing explore-agent tool calls by 73–80%. Install: npm install opencode-kcp-plugin and add "plugin": ["opencode-kcp-plugin"] to opencode.json. Source: plugins/opencode/

kcp-memory is the episodic memory layer for Claude Code. Indexes ~/.claude/projects/**/*.jsonl session transcripts and ~/.kcp/events.jsonl tool-call events into a local SQLite+FTS5 database. Three-layer memory model: working (context window) → episodic (kcp-memory) → semantic (Synthesis). Runs as a daemon (port 7735), CLI, and MCP server (6 tools: kcp_memory_search, kcp_memory_events_search, kcp_memory_list, kcp_memory_stats, kcp_memory_session_detail, kcp_memory_project_context). PostToolUse hook for near-real-time indexing. v0.4.0 — proactive session-start context via PWD detection. Install: curl -fsSL https://raw.githubusercontent.com/Cantara/kcp-memory/main/bin/install.sh | bash

Blog series

The full design rationale, benchmarks, and adoption walkthroughs are documented at wiki.totto.org in the Knowledge Context Protocol series:

Post	Key content
Beyond llms.txt: AI Agents Need Maps, Not Tables of Contents	Why llms.txt has six structural limits; KCP proposal
Add knowledge.yaml to Your Project in Five Minutes	Adoption gradient Level 1–3 with field reference
KCP on Two Repos, Two Days	74% + 53% tool-call reduction, benchmark methodology
KCP on Three Agent Frameworks	AutoGen 80%, CrewAI 76%, smolagents 73%
kcp-commands: Save 33% of Context Window	Phase A/B/C design, 283 manifests, 67K tokens saved
KCP Comes to OpenCode	opencode-kcp-plugin: system prompt injection + glob annotation
kcp-memory: Give Claude Code a Memory	Three-layer memory model, MCP server, 6 tools
The Front Door and the Filing Cabinet: A2A Agent Cards Meet KCP	A2A + KCP composability; 4 simulators, 150 adversarial tests; 8 spec gaps fed into v0.7+

Governance

KCP has been submitted to the Agentic AI Foundation (Linux Foundation, launched December 2025) for consideration as a neutral-governance project alongside MCP and AGENTS.md. The AAIF brings together 146 member organizations — including AWS, Anthropic, Google, Microsoft, and OpenAI — under neutral governance for agentic infrastructure standards.

Until formal acceptance, KCP remains an Apache 2.0 open specification proposed by eXOReaction AS.

Status

Current: Draft specification — v0.10

This is an early proposal. The format is intentionally minimal. Feedback, use cases, and pull requests are welcome.

SPEC.md — Normative specification (field definitions, validation rules, conformance levels)
PROPOSAL.md — The case for a knowledge architecture standard
RFC-0001 — Extended capabilities (overview of all proposals; F/H/I/J/K/L/N promoted to v0.3–v0.4 core)
RFC-0002 — Auth and delegation metadata (access, auth_scope, auth promoted to core in v0.5–v0.6; delegation promoted to core in v0.7)
RFC-0003 — Cross-manifest federation proposal (manifests, external_depends_on, external_relationships — promoted to core in v0.9 as DAG with local authority)
RFC-0004 — Trust, provenance, and compliance metadata (trust.provenance, sensitivity promoted in v0.5; trust.audit promoted in v0.6; compliance promoted to core in v0.7)
RFC-0005 — Payment and rate-limit metadata proposal (payment.default_tier promoted to core in v0.5; rate_limits promoted to core in v0.8; payment methods and x402 remain RFC)
RFC-0006 — Context window hints (accepted; promoted to SPEC.md §4.10 in v0.4)
parsers/ — Reference parser/validator implementations (Python, Java) — 401 tests passing
bridge/ — MCP servers: expose any knowledge.yaml as MCP resources (TypeScript · Python · Java). The TypeScript parser, validator, and mapper live in bridge/typescript/src/ (parser.ts, validator.ts, mapper.ts).
plugins/opencode/ — OpenCode plugin (opencode-kcp-plugin on npm)
examples/ — Reference manifests at four adoption levels plus 4 simulation scenarios (150 adversarial tests: A2A+KCP clinical research, energy metering HITL, legal delegation chains, financial AML)
kcp-memory — Episodic memory daemon for Claude Code (separate repo)

Early Adopters

KCP is being used in production by practitioners who independently discovered the same problems it solves.

"A Norwegian public sector architect independently built a full KCP adoption framework — reference-quality knowledge.yaml, transition guide, snapshot-awareness pattern, LLM client protocol — and arrived at '2-4 files instead of 15-20' on his own."

If you are using KCP and want to be listed here, open an issue or reach out.

Contributing

Open an issue to: - Propose additions to the field set - Share a use case that the current spec does not cover - Report a gap or ambiguity in the format

The goal is a standard that solves the real problem without demanding complexity from those who do not need it.

License

Apache V2.

Proposed by eXOReaction AS — builders of Synthesis, based in Oslo, Norway.