AI tools editing your code make mistakes because they have no context. CodeDNA embeds it directly in the files — like DNA in every cell. Zero RAG. Minimal drift.
Imagine asking a contractor to renovate your apartment without showing them the floor plan. Same problem with AI editing your code. CodeDNA is the floor plan.
Every AI coding agent relies on multiple memory layers. Most of them are external to the code. CodeDNA is the only layer that lives inside the source files themselves — it travels with every clone, fork, and CI pipeline.
| Layer | Examples | Where it lives | Shared across tools? |
|---|---|---|---|
| LLM / Agent | Claude, GPT-4, Cursor, Copilot | Cloud | — |
| External memory | Chat history, Memory API | Cloud / external DB | ✗ tool-specific |
| Markdown / Config | CLAUDE.md, .cursorrules, AGENTS.md | Repo (outside source files) | partial |
| CodeDNA | `exports`, `rules`, `agent`, `message`, `.codedna` | Inside every source file + repo root | ✓ always |
One command installs CodeDNA rules for any AI coding assistant. Or pick the file for your tool and paste it into your project.
```bash
bash <(curl -fsSL https://raw.githubusercontent.com/Larens94/codedna/main/integrations/install.sh) claude-hooks
# or: cursor-hooks copilot-hooks cline-hooks opencode windsurf
```

```bash
pip install git+https://github.com/Larens94/codedna.git   # set ANTHROPIC_API_KEY first
codedna init ./     # first-time: annotates every .py file
codedna update ./   # incremental: only unannotated files
codedna check ./    # coverage report, no changes
```
- **Claude Code**: hooks validate every `.py` write automatically.
- **Cursor** (`.cursor/hooks/`): validates on every file edit, reminds at session end. Requires v1.7+.
- **Copilot** (`.github/hooks/`): session start context, post-write validation, session end reminder.
- **Cline** (`.clinerules/hooks/`): TaskStart context injection, PostToolUse validation. Requires v3.36+.
- **opencode**: `AGENTS.md` + JS plugin in `.opencode/plugins/`. Validates 11 languages on every write.
- **Windsurf**: add `.windsurfrules` to your project root. Cascade reads it automatically.
- **Custom agents**: add `.agents/workflows/codedna.md` to your project. Compatible with Antigravity and custom agent frameworks.

Like biological DNA: cutting it in half produces two fragments that still carry the complete information. With CodeDNA, 10 random lines from anywhere in a file are enough for the AI to act correctly.
```yaml
project: myapp
packages:
  payments/:
    purpose: "Invoice generation, Stripe integration"
  analytics/:
    purpose: "Revenue reports, KPI dashboards"
    depends_on: [payments/, tenants/]
agent_sessions:
  - agent: claude-sonnet-4-6
    date: 2026-03-10
    task: "Implement monthly revenue aggregation"
    changed: [analytics/revenue.py]
```
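A planner can load this manifest and query the dependency graph directly. A minimal sketch, assuming PyYAML and the field names from the example above (`dependents_of` is a hypothetical helper, not part of the CodeDNA CLI):

```python
import yaml

def dependents_of(manifest: dict, package: str) -> list[str]:
    """Return the packages whose depends_on lists the given package."""
    return [
        name for name, info in manifest.get("packages", {}).items()
        if package in (info or {}).get("depends_on", [])
    ]

# manifest = yaml.safe_load(open(".codedna"))
# dependents_of(manifest, "payments/")  -> which packages break if payments/ changes?
```

For the manifest above, `dependents_of(manifest, "payments/")` returns `["analytics/"]` — the blast radius of a change, answered without opening a single source file.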
"""pricing.py — Pricing engine with tier discounts.
exports: apply_discount(cents, tier) -> int
used_by: checkout.py → build_cart
rules: NEVER exceed MAX_DISCOUNT_RATE from config.py;
apply_discount() must cap before returning.
DB: discount_tiers(tier, multiplier).
"""
```python
def apply_discount(cents: int, tier: str) -> int:
    """Apply tier discount to price in cents.
    Rules: MUST cap discount before returning — exceeding
    MAX_DISCOUNT_RATE is a financial compliance bug.
    After fix #42: also check tier != 'internal'.
    """
    discount = get_multiplier(tier)
    discount = min(discount, MAX_DISCOUNT_RATE)
    return int(cents * (1.0 - discount))
```
```python
# ❌ Ambiguous — euros? cents?
price = request.json.get("price")
data = get_users()

# ✅ CodeDNA — type, domain, origin are clear
int_cents_price_from_request = request.json.get("price")
list_dict_users_from_db = get_users()
```
```python
# 1. Read .codedna — project structure
# 2. Read module docstring (8–12 lines each)
# 3. Filter: used_by mentions target? Include
#    rules mentions task domain? Include
# 4. Build exports → used_by graph
# 5. Open in full ONLY the relevant files
# Cost: ~50 tok × N files = complete map
```
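The steps above can be sketched as a tiny planner. This is a hedged simplification, not the real CLI: `read_header` and the substring filter stand in for proper docstring parsing, and only the header of each file is ever read.

```python
from pathlib import Path

def read_header(path: Path, max_lines: int = 12) -> str:
    """Read only the top of the file — the CodeDNA docstring header."""
    with open(path, encoding="utf-8") as f:
        return "".join(line for _, line in zip(range(max_lines), f))

def plan(repo: Path, target: str, domain: str) -> list[Path]:
    """Files whose header mentions the target file or the task domain."""
    relevant = []
    for path in sorted(repo.rglob("*.py")):
        header = read_header(path)           # ~50 tokens per file, never the body
        if target in header or domain in header:
            relevant.append(path)            # used_by:/rules: hit → open in full
    return relevant
```

The point of the sketch: the planner builds a complete map of the repo at header cost, and only the files it selects are ever loaded in full.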
## `message:` — Agent-to-Agent Chat in Code

The `agent:` field records what an agent did. The `message:` sub-field adds a conversational layer — soft observations and open questions left directly for the next agent, co-located with the code.
```text
agent: claude-sonnet-4-6 | 2026-03-10 | Implemented monthly_revenue.
message: "rounding edge case in multi-currency
          — not yet sure if this should be a rule"

agent: gemini-2.5-pro | 2026-03-18 | Added annual_summary.
message: "@prev: confirmed, promoted to rules:.
          New: timezone rollover in January"
```
Lifecycle: a `message:` is either promoted to `rules:` (reply `@prev: promoted to rules:`) or dismissed (`@prev: not applicable because...`). Always append-only — never deleted.
Works at both levels: Level 1 (module docstring) for agents reading the full file, and Level 2 (function docstring) for agents using a sliding window that never sees the header.
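For illustration, the `agent:` / `message:` log above can be parsed mechanically. A hedged sketch — the single-line format follows the examples, but the regex and `parse_sessions` are our assumptions, not a CodeDNA API:

```python
import re

# agent: <name> | <date> | <task>   (one entry per line, per the examples above)
ENTRY = re.compile(
    r'^agent:\s*(?P<agent>[\w.\-]+)\s*\|\s*(?P<date>[\d-]+)\s*\|\s*(?P<task>.+)$'
)

def parse_sessions(log: str) -> list[dict]:
    """Group each agent: entry with its optional message: line (append-only log)."""
    sessions, current = [], None
    for line in log.splitlines():
        line = line.strip()
        m = ENTRY.match(line)
        if m:
            current = m.groupdict() | {"message": None}
            sessions.append(current)
        elif line.startswith("message:") and current:
            current["message"] = line.removeprefix("message:").strip().strip('"')
    return sessions
```

Because the format is line-oriented, the same parser works whether the log sits in a module docstring (Level 1) or a function docstring (Level 2).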
Benchmark result: in the AgentHub multi-agent experiment (DeepSeek R1, 5 agents, 83 minutes), message: was adopted in 54 out of 55 files (98.2%) — spontaneously, with no mid-session reminders. Agents used it for handoff notes, per-function observations, and cross-file constraint propagation.
## `wiki:` — LLM-Wiki Layer for Your Codebase

The opt-in `wiki:` field in a docstring points to a curated markdown page with deeper context. When present, a prior agent decided this file deserves notes beyond what the terse header can hold. The next agent reads it before editing. Two commands turn your annotations into a navigable knowledge graph.
"""revenue.py — Monthly revenue aggregation.
exports: monthly_revenue(year, month) -> dict
used_by: api/reports.py → revenue_route
related: billing/currency.py — shares multi-currency logic
wiki: docs/wiki/revenue.md
rules: get_invoices() returns ALL tenants
— MUST filter is_suspended() BEFORE sum
agent: claude-sonnet-4-6 | 2026-03-10 | ...
"""
*Wiki layer in action — Obsidian graph of a real project annotated with CodeDNA.*
Five scenarios illustrate the categories of errors AI agents make without architectural context. For measured results, see the SWE-bench benchmark below.
- Without `used_by:`, the AI updates only `utils.py` and leaves `main.py` with a runtime `KeyError`.
- `price = 1999` — euros or cents? Without semantic naming the AI gets the unit wrong. With CodeDNA: `int_cents_price_from_request` — zero ambiguity.
- Rename `format_revenue()` → `format_currency()`: the `rules:` field records the rename; the control calls the old name and crashes.
- From the `exports:` → `used_by:` graph, the agent identifies exactly the 2 affected files.

Benchmark: Django issues from SWE-bench, tested across multiple LLMs. Same prompt, same tools, same tasks. DeepSeek Chat: +17pp F1, p=0.001, 10/0/0 · Gemini 2.5 Flash: +13pp F1, p=0.040 · Gemini 2.5 Pro: +9pp F1. All three models improve.
Navigation Demo — django__django-11808 · DeepSeek Chat · 5 runs
Without CodeDNA: agent opens random files, stops early — 2/10 critical files found.
With CodeDNA: follows the `used_by:` chain — 6/10 critical files found.
🔬 Methodology: SWE-bench Django tasks × 3 models (Gemini 2.5 Flash ✓, DeepSeek Chat 10 tasks ✓, Gemini 2.5 Pro ✓). 3–5 runs/task at T=0.1. Identical system prompt, same 3 tools (read_file, list_files, grep), max 30 turns. Metric: File Localization F1 (ground-truth files from patch). Statistical test: Wilcoxon signed-rank (one-tailed). 6 DeepSeek tasks independently replicated by @fabioscialanga. Script: benchmark_agent/swebench/run_agent_multi.py.
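The File Localization F1 metric used above is the standard set-based F1 between the files the agent touched and the ground-truth files from the gold patch. A small sketch (function name is ours):

```python
def file_localization_f1(predicted: set[str], gold: set[str]) -> float:
    """F1 between predicted files and the ground-truth files from the patch."""
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)           # files the agent got right
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

A "+17pp F1" result means this score, averaged over tasks and runs, rose by 0.17 when the same agent ran on an annotated repo.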
| Approach | Token overhead | Context drift | Retrieval latency | Sliding-window | Infrastructure |
|---|---|---|---|---|---|
| CLAUDE.md / CursorRules | Low | Medium | Zero | No | External file |
| RAG / Vector DB | Low | Medium | High | No | DB + embedding |
| MemGPT | Medium | Low | Medium | No | Complex system |
| CodeDNA ✦ | Low (inline) | Low | Zero | Yes ✓ | None |
- Semantic naming: `<type>_<shape>_<domain>_<origin>`
- Manifest-Only Planner Read
- CLI: `codedna init`, `codedna update`, `codedna check` · pip installable
- Claude Code Challenge: 7/7 patch files in ~8 min vs 6/7 in ~10–11 min (control). Results →

M1–M5 are part of a funding application to the NLnet NGI0 Commons Fund. If you find CodeDNA useful, ⭐ the repo and share it.