Pillar: new-tooling-2026 | Date: April 2026
Scope: New MCPs released since Jan 2026 worth evaluating: design system MCPs, accessibility MCPs (axe-core MCP, Pa11y), observability MCPs, vector store MCPs, agentic testing MCPs, browser automation beyond Playwright. New skills/commands ecosystems (community plugins, Anthropic official skill registry). New hook patterns published (PreToolUse/PostToolUse/Stop/SessionStart innovations). Anthropic harness team roadmap signals for next 1-2 quarters. What Jacob should NOT reinvent because it is shipping.
Sources: 63 gathered, consolidated, synthesized.
Three distinct accessibility MCP solutions are now available for Claude Code: Deque's enterprise Axe MCP Server (GA February 11, 2026), a community open-source a11y-mcp (40 stars), and the Community-Access/accessibility-agents bundle (24 tools, 11 specialist agents).[3][36][54] All axe-core-based tools share the same fundamental ceiling: automated scanning detects only 30–40% of accessibility issues; manual audits remain required for complete WCAG coverage.[25]
Deque's Axe MCP Server launched at Axe-con 2025 and reached GA on February 11, 2026; it is included at no additional cost in Axe DevTools for Web licenses.[3][35] It is available at https://github.com/dequelabs/axe-mcp-server-public and is explicitly confirmed for GitHub Copilot, Cursor, Claude Code, and VS Code.[3]
Core workflow: analyze → remediate → validate, with configurable auto-validation of fixes. Connects to Deque University's WCAG-aligned knowledge base. CTO Dylan Barrell: "Developers can contribute to accessibility earlier in the software development lifecycle, all while using existing tools."[3]
Roadmap: expanding to power AI agents across the full dev lifecycle; automated intelligent guided tests; advanced rules for increased issue detection.[3]
Key finding: The Axe MCP Server is the only accessibility tool that combines confirmed Claude Code support with enterprise backing, but its 30–40% automated detection ceiling means it is a triage accelerator, not a compliance replacement.[25]
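The detection ceiling can be turned into rough arithmetic. The sketch below assumes, purely for illustration, that automated findings are a proportional sample of all issues; the function name and the proportionality assumption are not from the cited sources.

```python
def estimate_total_issues(automated_findings: int,
                          low_pct: int = 30,
                          high_pct: int = 40) -> tuple[int, int]:
    """Return (min_total, max_total) issues implied by an automated scan.

    Assumes (hypothetically) that the scan caught between low_pct and
    high_pct percent of all issues, per the 30-40% ceiling quoted above.
    """
    if automated_findings < 0:
        raise ValueError("automated_findings must be non-negative")
    if not 0 < low_pct <= high_pct <= 100:
        raise ValueError("expected 0 < low_pct <= high_pct <= 100")
    # Integer ceiling division keeps the arithmetic exact.
    min_total = -(-automated_findings * 100 // high_pct)
    max_total = -(-automated_findings * 100 // low_pct)
    return min_total, max_total


# 12 automated findings imply roughly 30 to 40 issues overall.
print(estimate_total_issues(12))
```

Under that assumption, most of the remaining issues would still need a manual audit to surface.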
| Tool | Engine | Stars / Status | Cost | Claude Code | CI/CD |
|---|---|---|---|---|---|
| Deque Axe MCP Server[3] | Axe Platform (enterprise) | GA (Feb 11, 2026) | Included in Axe DevTools | Confirmed | Yes |
| a11y-mcp (community)[36] | axe-core (OSS) | 40 stars, 8 forks | Free / MPL-2.0 | Not confirmed | Yes |
| Community-Access agents[54] | axe-core + 24 tools | 11 specialist agents | Free / OSS | Confirmed | Yes |
| Pa11y / pa11y-ci[25] | HTML CodeSniffer | No MCP server | Free / OSS | CLI only | Yes (CLI) |
No dedicated Pa11y MCP server exists in 2026. Pa11y remains a CLI tool for batch URL testing via pa11y-ci, best integrated as a GitHub Actions step or hook script.[25] The community a11y-mcp uses axe-core, not Pa11y. MCP skills on mcpmarket.com covering accessibility (e.g., "Accessibility Testing Skill for Claude Code | axe-core") also use axe-core exclusively.[25]
The design system MCP space consolidated around three production-grade offerings in Q1–Q2 2026: Figma's official MCP server (production since June 2025, expanded February 17, 2026), a Figma-native "Create Design System Rules" skill, and Storybook's MCP server (v10.3+, React only). Together they enable a full code→Figma→code roundtrip.[4][27][38]
Figma MCP Server launched in Dev Mode in June 2025. On February 17, 2026, a new code-to-Figma direction was added, completing the full roundtrip.[4][37][55] Claude Code is listed as an official Figma MCP partner.
| Direction | Key Tool | What it does | Available Since |
|---|---|---|---|
| Figma → Code[55] | (MCP context) | Extract design tokens, colors, spacing, typography; pixel positions → layout relationships; raw hex → token references; nested layers → flat dev structure | June 2025 |
| Code → Figma[37] | generate_figma_design | Capture functioning UI from browser (prod/staging/localhost), convert to editable Figma frames with multi-screen flow context | Feb 17, 2026 |
| Canvas manipulation[37] | use_figma | Direct Figma canvas control from Claude Code | Late March 2026 |
Full roundtrip: Claude Code generates code → code runs in browser → screenshot captured → Figma frame created → designers iterate → Figma MCP feeds updated specs back to Claude Code.[4]
Code Connect: Links existing codebase components to Figma designs, enabling side-by-side comparisons, annotation, and variant exploration without re-coding.[55]
Community extension: southleft/figma-console-mcp — "Your design system as an API. Connect AI to Figma for extraction, creation, and debugging."[4]
Key finding: The Figma MCP's February 2026 code-to-Figma direction completes a bidirectional loop — development artifacts can now flow into design tools without manual redraw, eliminating the traditional dev→designer handoff bottleneck.[4]
The "Create Design System Rules" skill, documented in the official Figma Developer Docs, generates custom design system rules that guide AI coding agents to produce consistent code when implementing Figma designs.[38] Output targets: CLAUDE.md for Claude Code, .cursor/rules/figma-design-system.mdc for Cursor, AGENTS.md for Codex CLI.[38]
5-step workflow: (1) Run tool with language/framework parameters → (2) Analyze codebase for component organization, styling, patterns → (3) Generate rules covering components, styling, Figma integration, assets, conventions → (4) Save to agent config → (5) Validate and iterate with test implementations.[38]
Critical rules use "IMPORTANT:" prefix for conventions that must never be violated (e.g., "Never hardcode colors — use tokens from [location]").[38]
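A minimal sketch of how generated rules might be rendered for an agent config file, with the "IMPORTANT:" prefix applied to critical conventions. The `Rule` type, the heading text, and the token path are hypothetical; the skill's real output format may differ.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    text: str
    critical: bool = False  # critical rules get the "IMPORTANT:" prefix


def render_rules(rules: list[Rule], heading: str = "Figma design system rules") -> str:
    """Render rules as a Markdown section suitable for appending to CLAUDE.md."""
    lines = [f"## {heading}", ""]
    for rule in rules:
        prefix = "IMPORTANT: " if rule.critical else ""
        lines.append(f"- {prefix}{rule.text}")
    return "\n".join(lines) + "\n"


rules = [
    # The token location below is a placeholder, not a real project path.
    Rule("Never hardcode colors; use tokens from src/styles/tokens.css", critical=True),
    Rule("Prefer existing components over new markup"),
]
print(render_rules(rules))
```

The same rendered text could be written to .cursor/rules/figma-design-system.mdc or AGENTS.md for the other agents named above.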
Storybook's official MCP server ships at http://localhost:6006/mcp in v10.3+. Connects AI agents to Storybook instances to understand components, generate stories, run tests, validate UI consistency.[27] Current limitation: React projects only — Vue, Angular, Web Components, Svelte support planned.[27]
| Tool | Purpose |
|---|---|
| get-storybook-story-instructions[27] | Story and interaction test guidance |
| preview-stories[27] | Renders story previews in chat |
| list-all-documentation[27] | Returns full component index |
| get-documentation[27] | Component props and sample stories |
| get-documentation-for-story[27] | Full story code + docs |
| run-story-tests[27] | Executes tests, accessibility checks, reports results |
Dual value: Storybook MCP is simultaneously a design system MCP (component discovery, documentation) and an agentic testing MCP (run-story-tests + accessibility checks).[27]
See also: Agentic Testing pillar for Storybook's run-story-tests integration in CI pipelines
Claude Code now ships with built-in OTLP telemetry, so no external wrapper is required. Five additional observability tools are available for querying external systems or wrapping the CLI: Datadog MCP (preview), Grafana MCP (v0.12.0, April 23, 2026), Traceloop OTel MCP (v0.2.2, 185 stars), the community claude_telemetry wrapper, and New Relic's agentic platform (February 24, 2026).[29][19][28][39]
Enable with CLAUDE_CODE_ENABLE_TELEMETRY=1. Three independent signal types, all OTLP-standard.[29] Improved in v2.1.117 (April 22, 2026) with user_prompt and command_name events added to traces.[9]
| Signal | Contents | Enable With | Status |
|---|---|---|---|
| Metrics[29] | Counters for tokens, cost, sessions, lines of code, tool decisions | OTEL_METRICS_EXPORTER | Stable |
| Log events[29] | Structured records for each prompt, API request, error, tool result | OTEL_LOGS_EXPORTER | Stable |
| Traces[29] | Spans for interaction, model request, tool call, hook | OTEL_TRACES_EXPORTER + CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1 | Beta |
| Span | Description |
|---|---|
| claude_code.interaction[29] | Single turn (top level) |
| claude_code.llm_request[29] | Each API call: model, latency, tokens |
| claude_code.tool[29] | Each tool invocation |
| claude_code.tool.blocked_on_user[29] | Permission wait time |
| claude_code.tool.execution[29] | Actual tool execution |
| claude_code.hook[29] | Hook execution (beta flag required) |
Privacy defaults: Prompt and content NOT recorded by default. Opt-in via: OTEL_LOG_USER_PROMPTS, OTEL_LOG_TOOL_DETAILS, OTEL_LOG_TOOL_CONTENT (truncated at 60KB).[29] Subagent spans nest under parent claude_code.tool — full delegation chain visible in one trace. W3C trace context propagated into CLI subprocess and forwarded to Bash commands.[29]
Supported backends: Honeycomb, Datadog, Grafana, Langfuse, self-hosted OTLP collector.[29]
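The environment variables above can be assembled programmatically before launching the CLI. This is a sketch under assumptions: the variable names come from the signal table, while the `otlp` exporter value and `OTEL_EXPORTER_OTLP_ENDPOINT` follow general OpenTelemetry SDK conventions and the endpoint URL is a placeholder for a local collector.

```python
import os
from typing import Mapping, Optional


def telemetry_env(endpoint: str,
                  enable_traces: bool = False,
                  base: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Build an environment mapping that switches on Claude Code telemetry."""
    env = dict(os.environ if base is None else base)
    env.update({
        "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
        "OTEL_METRICS_EXPORTER": "otlp",           # assumed exporter value
        "OTEL_LOGS_EXPORTER": "otlp",              # assumed exporter value
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,   # standard OTel variable
    })
    if enable_traces:
        # Traces are beta and need the extra flag listed in the table.
        env["OTEL_TRACES_EXPORTER"] = "otlp"
        env["CLAUDE_CODE_ENHANCED_TELEMETRY_BETA"] = "1"
    return env


env = telemetry_env("http://localhost:4317", enable_traces=True, base={})
for name in sorted(env):
    print(f"{name}={env[name]}")
```

The resulting mapping could be passed as `env=` to `subprocess.run` when starting the CLI. Prompt and tool content stay unrecorded unless the opt-in variables listed under privacy defaults are also set.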
Key finding: Claude Code's native OTLP removes any need to build a separate observability wrapper — the instrumentation ships in the binary. The remaining gap is on the query/alert side, which Datadog MCP and Grafana MCP address.[29][5][19]
The Datadog MCP server was announced at the Datadog DASH conference (June 2025) and is now in preview.[39] Partners: Claude Code (confirmed), Cursor, OpenAI Codex CLI (partnership announced), Block's Goose.[5][39] It supports HIPAA-compliant environments with user-based RBAC.[5]
| Tool | Purpose |
|---|---|
| get_logs, list_spans, get_trace[39] | Live log and trace retrieval |
| list_metrics, get_metrics, get_monitors[39] | Metrics and monitor status |
| list_incidents, get_incident[39] | Incident management |
| list_dashboards[39] | Dashboard discovery |
| list_hosts[39] | Host inventory |
Official Grafana tool at grafana/mcp-grafana. 2,900 stars, 346 forks. v0.12.0 released April 23, 2026. Go (94.3%), Apache 2.0.[19]
Tools by category: Dashboard operations (search, retrieve, patch, extract panel queries via JSONPath); Prometheus (PromQL instant/range queries, metric metadata, label discovery, histogram percentiles); Loki (LogQL queries, log patterns, label metadata); Alert rules; Grafana Incident creation/tracking; Sift investigations (error patterns, slow requests); OnCall (schedule/shift management); Rendering (Dashboard/panel PNG export); Deeplink generation.[19]
Context-aware design: get dashboard summary and get dashboard property (JSONPath) tools minimize token cost vs. full dashboard JSON retrieval.[19]
GitHub: traceloop/opentelemetry-mcp-server. Version 0.2.2 (February 8, 2026). 185 stars, 15 forks. Apache 2.0. Unified MCP server for querying OTel traces across Jaeger, Grafana Tempo, and Traceloop Cloud.[28]
LLM-specialized tools with OpenLLMetry semantic convention support:[28] get_llm_usage (aggregate token usage across services/models), get_llm_model_stats (compare model performance), get_llm_expensive_traces (highest token-usage operations), get_llm_slow_traces (performance bottlenecks). Complementary to Datadog/Grafana for pure OTel trace querying.[28]
claude_telemetry (TechNickAI/claude_telemetry): Drop-in replacement swapping claude command for claudia. Logs tool calls, token usage, costs, execution traces. Backends: Logfire, Sentry, Honeycomb, Datadog. Alternative to full OTLP stack configuration.[56]
New Relic Agentic Platform (February 24, 2026): No-code agentic platform for data observability agents. MCP support for connecting AI applications to data sources. Fleet management for OTel collectors. Focused on observability-persona agents, not general-purpose.[5]
See also: Cost Optimization pillar for token/cost metrics analysis from OTLP data

Four vector store MCPs are in active use in 2026: Zilliz claude-context (9,900+ stars, dominant for semantic code search), Qdrant official MCP (1,400 stars, semantic memory layer), Pinecone Plugin (official, February 11, 2026; RAG + assistant), and bobmatnyc/mcp-vector-search (ChromaDB-backed CLI-first search).[42][6][30][57]
GitHub: zilliztech/claude-context. 9,900+ stars, 751 forks — highest adoption of any semantic code MCP.[42] Benchmarked ~40% token reduction vs. traditional context loading. Handles millions of lines of code (enterprise-scale). Supports 15+ AI coding tools with Claude Code as primary.[42]
Architecture: Hybrid search (BM25 keyword + dense vector embeddings), AST-based code chunking for 8+ languages (TypeScript, Python, Java, C++, C#, Go, Rust, etc.), incremental indexing via Merkle trees.[42] Embedding providers: OpenAI (text-embedding-3-small/large), VoyageAI (voyage-code-3), Ollama (local), Google Gemini.[42]
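To illustrate why Merkle trees help incremental indexing, here is a simplified sketch: if the root hash is unchanged, nothing needs re-embedding; if it differs, only files whose leaf hashes changed are re-indexed. This is a flat illustration with hypothetical function names, not claude-context's actual implementation.

```python
import hashlib


def leaf_hashes(files: dict[str, bytes]) -> dict[str, str]:
    """Hash each file's content; the keys are file paths."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def merkle_root(leaves: dict[str, str]) -> str:
    """Combine leaf hashes pairwise until a single root hash remains."""
    level = [hashlib.sha256(f"{p}:{h}".encode()).digest()
             for p, h in sorted(leaves.items())]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


def changed_paths(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Paths that were added, removed, or modified between two snapshots."""
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


old = leaf_hashes({"a.py": b"print(1)", "b.py": b"print(2)"})
new = leaf_hashes({"a.py": b"print(1)", "b.py": b"print(3)"})
if merkle_root(old) != merkle_root(new):
    print("re-index:", sorted(changed_paths(old, new)))
```

A production tree would typically mirror the directory hierarchy so that unchanged subtrees can be skipped without comparing every leaf.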
Setup: claude mcp add claude-context — requires free Zilliz Cloud account + embedding API key.[42]
Community fork: danielbowne/claude-context — local LanceDB backend, no cloud required.[57]
Contextual chunking pattern (from Anthropic research, applied by both Zilliz and bobmatnyc): Prepending compact metadata header to each chunk before embedding reduces retrieval failures by 35–49%.[57]
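The contextual chunking pattern can be sketched in a few lines. The header fields and their format below are illustrative assumptions; the cited tools may use different metadata.

```python
def contextualize(chunk: str, *, path: str, symbol: str, language: str) -> str:
    """Prepend a compact metadata header to a chunk before it is embedded."""
    header = f"[file: {path} | symbol: {symbol} | language: {language}]"
    return f"{header}\n{chunk}"


chunk = "def total(items):\n    return sum(i.price for i in items)"
text_to_embed = contextualize(
    chunk, path="billing/cart.py", symbol="total", language="python"
)
print(text_to_embed)
```

The embedding model then sees where the code lives and what it defines, which is the extra signal the pattern relies on; the original chunk text is still stored unchanged for display.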
Key finding: Modern coding agents (Claude Code, Cursor, Devin) are NOT primarily using vector RAG for code retrieval — they use file system navigation, grep, and AST parsing. Vector stores shine for knowledge bases and documentation where explicit structural relationships are absent. The 9,900-star adoption of claude-context suggests vector search adds value primarily when codebases exceed context window capacity.[6][42]
| Tool | Stars | Backend | Primary Use Case | Cloud Required | Install |
|---|---|---|---|---|---|
| Zilliz claude-context[42] | 9,900+ | Milvus / Zilliz Cloud | Semantic code search, 40% token reduction | Optional (fork = local) | claude mcp add claude-context |
| Qdrant official[6] | 1,400 | Qdrant (local or cloud) | Semantic memory, knowledge base, docs retrieval | No (QDRANT_LOCAL_PATH) | uvx / Docker / Smithery |
| Pinecone Plugin[30] | Official plugin | Pinecone | RAG, Assistant (multi-format docs), semantic search | Yes (API key required) | claude plugin install pinecone |
| bobmatnyc mcp-vector-search[57] | Community | ChromaDB | CLI-first semantic code search | No | CLI + MCP |
qdrant/mcp-server-qdrant, v0.8.1 (December 2025), 1,400 stars, 268 forks, Apache 2.0.[6] Two core tools: qdrant-store (store information with metadata to collections) and qdrant-find (semantic similarity search, up to 10 results by default).[6] Default embedding: FastEmbed with sentence-transformers/all-MiniLM-L6-v2. Transport options: stdio, SSE, streamable HTTP.[41] Positioned as semantic MEMORY layer — distinct from code navigation use case.[6]
The Pinecone plugin was published via Anthropic's Claude Code Plugin Marketplace on February 11, 2026.[30] Install: claude plugin install pinecone. Slash commands: /pinecone:query, /pinecone:assistant-create, /pinecone:assistant-upload, /pinecone:assistant-sync, /pinecone:assistant-chat.[30] Pinecone Assistant supports upload of PDF, Markdown, TXT, DOCX, and JSON files, with cited answers and page references. An alternative MCP approach is available via npx @pinecone-database/mcp with cascading-search (multi-index with deduplication and reranking).[30]
Five browser automation MCPs are production-ready in 2026, with Agent-Browser (Vercel Labs, 29,500+ stars) and Chrome DevTools MCP (Google official, 37,400 stars) as the major 2026 additions. Token efficiency ranges from 114K tokens/task (Playwright MCP) to 7,800 tokens/task (Agent-Browser) — a 14.7× spread that determines which tool is viable in constrained contexts.[10][59]
Key finding: Industry consensus for 2026: "Start with Playwright MCP as default; choose Browserbase for cloud scale, mcp-chrome for working inside an already-logged-in browser, Browser Use for persistent agent sessions, and Chrome DevTools MCP for debugging and performance audits."[31]
| Tool | Tokens per Task | vs Playwright MCP | Best For |
|---|---|---|---|
| Playwright MCP (Microsoft)[1] | ~114,000 | Baseline | CI/CD, cross-browser, standard default |
| Playwright CLI (@playwright/cli)[59] | ~27,000 | 4× reduction | Context-constrained CI environments |
| Agent-Browser (Vercel Labs)[59] | ~7,800 | 14.7× reduction | Token-constrained, high test volume |
Chrome DevTools MCP is maintained by Google's ChromeDevTools team. 37,400 stars, 2,300 forks. v0.23.0, April 22, 2026. TypeScript (95.3%), Node.js 20.19+.[10]
34 tools across 8 categories: Input automation (9 tools), Navigation (6), Emulation (2), Performance (3, covering trace recording and analysis), Network (2), Debugging (6, covering script execution, console, Lighthouse audits, screenshots), Extensions (5), Memory (1).[10]
Installation: Plugin (MCP + Skills bundled) or MCP-only (claude mcp add chrome-devtools). "Slim mode" with just 3 tools for basic tasks.[10]
Positioning: Specialized for QA, performance audits, debugging — NOT general automation. Best when you need Lighthouse, network inspection, or performance tracing.[10]
Agent-Browser (Vercel Labs) has 29,500+ GitHub stars, making it one of the most-starred browser tools in 2026.[10] It is written in Rust for speed and minimal overhead. Its key innovation: it generates an accessibility tree and assigns @refs to interactive elements.[47]
"Ralph Wiggum Loop" Pattern (Vercel/Pulumi, 2026): Self-verifying agent — make change → take snapshot → evaluate vs expected state → if wrong, correct and loop. No human in loop.[59]
Stagehand v3 (Browserbase) is a complete rewrite with an AI-native architecture. It talks directly to the browser via Chrome DevTools Protocol (CDP) and is 44% faster than v2. It offers multi-language support and is driver-agnostic (not tied to Playwright). Three primitives: act (perform action), extract (pull data), observe (understand screen).[10]
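The self-verifying "Ralph Wiggum Loop" described earlier in this section (make change, take snapshot, evaluate, correct and loop) can be sketched with plain callables. The function and parameter names are hypothetical, and the iteration cap is an added safety assumption rather than part of the published pattern.

```python
from typing import Callable, Optional


def self_verifying_loop(make_change: Callable[[Optional[str]], None],
                        take_snapshot: Callable[[], str],
                        expected: str,
                        max_iterations: int = 5) -> int:
    """Repeat change -> snapshot -> evaluate until the snapshot matches.

    Returns the number of attempts used; raises if the cap is reached.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_iterations + 1):
        make_change(feedback)       # agent edits code, guided by last mismatch
        snapshot = take_snapshot()  # e.g. an accessibility-tree snapshot
        if snapshot == expected:
            return attempt
        feedback = f"expected {expected!r}, got {snapshot!r}"
    raise RuntimeError(f"no match after {max_iterations} attempts")


# Toy stand-ins for an agent and a browser: the "page" is correct on try 3.
state = {"edits": 0}

def fake_change(feedback: Optional[str]) -> None:
    state["edits"] += 1

def fake_snapshot() -> str:
    return "button: Save" if state["edits"] >= 3 else "button: Sav"

print(self_verifying_loop(fake_change, fake_snapshot, "button: Save"))
```

In a real setup the snapshot comparison would likely be a fuzzier check than string equality, and the cap keeps a stuck agent from looping indefinitely.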
| Tool | Stars | Best For | Key Limitation | Cost |
|---|---|---|---|---|
| Chrome DevTools MCP[10] | 37,400 | Debugging, Lighthouse, performance | Not general automation | Free |
| Agent-Browser[10] | 29,500+ | Token-constrained test suites | Shadow DOM blind spot | Free |
| Playwright MCP[1] | Standard | CI/CD, cross-browser standard | ~114K tokens/task | Free |
| Browserbase + Stagehand[10] | — | Cloud scale, anti-bot bypass | Paid subscription required | Paid |
| Browser Use[31] | — | Persistent profiles, long-running sessions | Local + cloud modes (complexity) | Free / Paid |
| mcp-chrome[31] | — | Working inside already-logged-in browser | Chrome/Chromium only, manual extension | Free |
| AgentQL[10] | — | Structured data extraction from dynamic pages | Query language learning curve | Paid |
Shadow DOM limitation (2026): Modern design systems (Shoelace, Lit) hide elements in shadow roots. Accessibility tree snapshots — used by Playwright MCP, Agent-Browser, and Chrome DevTools MCP — cannot see them.[47]
See also: UI Feedback Loop pillar for deep Playwright/Chrome DevTools integration patterns; Autonomous Build Loop pillar for browser automation in self-verifying agent loops

The Claude Code plugin ecosystem reached 4,200+ skills, 770+ MCP servers, and 2,500+ marketplaces by April 2026, with 110,000+ monthly visitors to the official plugin directory.[58] The official anthropics/skills repo reached 125,000 stars with 14,600 forks, making it the most-starred AI tooling repository in the Claude ecosystem.[22]
GitHub: anthropics/claude-plugins-official. 18,000 stars, 2,200 forks, 310 commits.[7] Languages: Python (31.6%), TypeScript (28.9%), HTML (19.5%), Shell (13%), JS (7%).[7]
Plugin hierarchy: Skill (one SKILL.md file) → Plugin (plugin.json + skills/ + commands/ + agents/ + .mcp.json) → Marketplace (GitHub repo as plugin registry).[44]
New plugin capabilities in 2026:[44]
- .lsp.json: Real-time code intelligence, with official TypeScript, Python, and Rust plugins available
- monitors/monitors.json: Watches logs/files and delivers stdout as Claude notifications
- settings.json in plugin root: Can set a default agent (e.g., "agent": "security-reviewer")
- bin/ on PATH (v2.1.91): Executables in bin/ are added to the Bash tool's PATH while the plugin is active

Verification model: External plugins submit via form. "Anthropic Verified" badge for additional review. Security checks: credential exfiltration, destructive commands, pipe-to-shell patterns. Anthropic does not warrant third-party plugin behavior.[7][43]
Key finding: The frontend-design skill reached 277,000+ installs by March 2026 — the most-installed single skill — demonstrating that design system enforcement via SKILL.md is the highest-demand use case in the ecosystem.[22]
GitHub: anthropics/skills. 125,000 stars, 14,600 forks. Actively maintained.[22]
"Skills 2.0" evolution: Skills now bundle instructions + scripts + templates + reference materials — a significant jump from text-only SKILL.md files. Community term: "Skills 2.0."[22]
Notable skill categories: Document skills (docx, pdf, pptx, xlsx — production-grade, ships in Claude.ai); Development (testing web apps, MCP server generation); Creative (art, music, design); Enterprise (communications, branding).[22]
Desktop Extensions were published June 26, 2025 (initial .dxt format). On September 11, 2025, the format migrated to .mcpb (MCP Bundle); legacy files remain functional.[17]
Format: ZIP archive with manifest.json + server/ + dependencies/ + icon.png. Server types: "node", "python", "binary".[17] Claude Desktop includes built-in Node.js runtime. Python servers bundle own lib/ directory. OS keychain storage for API keys. Automatic updates via curated directory. Enterprise controls: Group Policy/MDM, extension blocklists, private directories, pre-installation approval.[17]
GitHub: metatool-ai/metamcp. 2,200 stars. Docker-based. MetaMCP is itself an MCP server — plugs into any MCP client.[23]
Problem solved: 7 active MCP servers consume ~67,300 tokens at session start — over one-third of a 200K context budget. Lazy loading reduces context usage by up to 95%.[34] MetaMCP provides the gateway to manage this via namespaces, tool picking, and multiple workspaces to prevent context pollution.[23]
Proposed lazy loading architecture from the community: a lazy flag in settings.json per server; Skills/agents declare required MCPs in frontmatter (mcp: { required: [postgres], optional: [redis] }); MCPs load only when specific agents execute, not into the main conversation thread. Proposed context reduction: 95%.[34]
Already shipping: Deferred tool schemas (ToolSearch capability) within sessions — schema-on-demand loading is functional. Full per-server lazy loading is an emerging pattern that may ship as official Anthropic feature.[34]
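The proposed frontmatter declaration can be sketched as a small resolver. Everything here is hypothetical: the proposal is a community design, not a shipped feature, and the per-server token figure is just the 67,300-token total divided evenly across 7 servers.

```python
AVG_TOKENS_PER_SERVER = 67_300 // 7  # even split of the quoted session-start cost


def servers_to_load(frontmatter: dict, available: set[str]) -> list[str]:
    """Resolve which MCP servers an agent needs from its frontmatter.

    Required servers must be available; optional ones load only if present.
    """
    mcp = frontmatter.get("mcp", {})
    required = list(mcp.get("required", []))
    optional = list(mcp.get("optional", []))
    missing = [name for name in required if name not in available]
    if missing:
        raise LookupError(f"required MCP servers not configured: {missing}")
    return required + [name for name in optional if name in available]


def estimated_tokens(servers: list[str]) -> int:
    """Rough schema cost, assuming every server costs the average."""
    return len(servers) * AVG_TOKENS_PER_SERVER


agent = {"mcp": {"required": ["postgres"], "optional": ["redis"]}}
configured = {"postgres", "github", "figma"}
loaded = servers_to_load(agent, configured)
print(loaded, estimated_tokens(loaded))
```

Under the even-split assumption, loading one server instead of seven costs about 9,600 tokens rather than 67,300; actual savings depend on how large each server's tool schemas are.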
| Plugin | Stars | Function | Source |
|---|---|---|---|
| claude-mem | 65,800+ | Persistent SQLite memory with context compression | [48] |
| vibe-kanban | 23,200+ | Kanban orchestration for 10+ coding agents simultaneously | [48] |
| ccusage | 11,500+ | Local usage analytics via JSONL parsing | [48] |
| CCHub (Tauri v2 + React + Rust) | — | Desktop app: MCP marketplace, config profiles, skills browser, security audit | [11] |
| codebase-graph | — | Knowledge graph from source via 42-language tree-sitter AST + FalkorDB | [11] |
| maestro-orchestrate | — | Multi-agent orchestration coordination | [11] |
| production-grade | — | 14-agent autonomous pipeline (PM through SRE) | [48] |
Third-party marketplaces: SkillsMP.com (universal SKILL.md format, community ratings), SkillKit (400,000+ skills).[48]
See also: Context Continuity pillar for vector-store memory plugin patterns

The hook system expanded from 12 documented events to 32 events in official docs by April 2026, a 167% increase in the event surface area.[8][53] Five handler types are now available, including recently added types such as HTTP (~February 2026), MCP Tool (v2.1.118, April 2026), and Agent (experimental).[8]
| Category | Events | Count |
|---|---|---|
| Session-level[8] | SessionStart, SessionEnd, InstructionsLoaded | 3 |
| Per-turn[8] | UserPromptSubmit, UserPromptExpansion, Stop, StopFailure, Notification, PreCompact, PostCompact | 7 |
| Agentic loop[8] | PreToolUse, PermissionRequest, PermissionDenied, PostToolUse, PostToolUseFailure, PostToolBatch | 6 |
| Agent/team[8] | SubagentStart, SubagentStop, TeammateIdle, TaskCreated, TaskCompleted | 5 |
| Config/environment[8] | ConfigChange, CwdChanged, FileChanged, WorktreeCreate, WorktreeRemove | 5 (counted as 4 in some docs) |
| MCP/Elicitation[8] | Elicitation, ElicitationResult | 2 |
Source discrepancy note: Community guides (claudefa.st, Pixelmojo) list 12 events; these are older or partial references. Official docs document 27–32 events as of April 2026. Always use official docs.[8][45][52]
| Type | How It Works | Block Mechanism | Added |
|---|---|---|---|
| Command ("command")[8] | Shell scripts with JSON stdin/stdout | Exit code 2 | Original |
| HTTP ("http")[8] | POST to remote endpoints | 2xx with decision: 'block' | ~Feb 2026 |
| MCP Tool ("mcp_tool")[9] | Call tools on connected MCP servers | Via MCP tool response | v2.1.118 (Apr 2026) |
| Prompt ("prompt")[8] | LLM-based evaluation (model: "fast-model") | JSON yes/no decision | Q1 2026 |
| Agent ("agent")[8] | Spawns subagent with Read, Grep, Glob tools | Agent decision | EXPERIMENTAL |
Exit code semantics (IMPORTANT): Exit code 0 = success (allows action); Exit code 2 = blocking error (blocks action, stderr to Claude); Any other code = non-blocking (shows error, continues). Exit code 1 is NON-BLOCKING.[8]
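The exit code semantics can be illustrated with a command hook written in Python. This is a sketch under assumptions: it expects the hook payload on stdin to carry `tool_name` and `tool_input.command` for Bash calls, and the blocked-substring policy is purely hypothetical.

```python
import json
import sys

# Hypothetical policy: substrings that should never reach the shell.
BLOCKED_SUBSTRINGS = ("rm -rf /", "git push --force")


def evaluate(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for a PreToolUse payload.

    0 allows the action; 2 blocks it and the message goes to stderr,
    where it is fed back to Claude. Exit code 1 would NOT block.
    """
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            return 2, f"Blocked by policy: command contains {needle!r}"
    return 0, ""


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read the JSON payload from stdin and translate the decision."""
    code, message = evaluate(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code


# In a deployed hook script the last line would be:
#     raise SystemExit(main())
```

Returning 2 rather than 1 is the important detail: a script that fails with exit code 1 only surfaces an error and lets the tool call proceed.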
| Innovation | Version | What It Enables |
|---|---|---|
| defer in PreToolUse[8][46] | v2.1.89+ | Human-in-the-loop: pause execution for external input, resume with claude -p --resume <session-id> |
| PermissionDenied hook[46] | v2.1.89 | Fires on classifier denials; return retry: true to let Claude try a different approach |
| updatedPermissions field[8] | Q1 2026 | Dynamically modify Claude's permissions from within a hook (addRules, replaceRules, removeRules, setMode, addDirectories) |
| updatedMCPToolOutput[8] | Q1 2026 | PostToolUse: override MCP tool output for post-processing/enrichment |
| MCP Tool handler type[9] | v2.1.118 | Call any MCP server tool from a hook; ${path} substitution in input |
| sessionTitle in UserPromptSubmit[20] | Week 15 | Hooks can SET the session title from hook output |
| PostToolUse execution timing[9] | v2.1.119 | Hooks can measure how long tool calls take |
| Hook output >50K to disk[46] | v2.1.86+ | Large hook output saved to disk with path + preview; prevents context flooding |
| once: true flag[8] | Q1 2026 | Run hook once per session then remove, for one-time setup operations |
| $CLAUDE_ENV_FILE persistence[53] | Q1 2026 | Persist env vars across hook firings (SessionStart, CwdChanged, FileChanged) |
| if matcher[8] | Week 13 | Fine-grained filtering: hook fires only when tool invocation pattern matches |
Key finding: The defer PreToolUse decision (v2.1.89) enables genuine human-in-the-loop patterns for CI/CD approval gates and multi-agent coordination — without polling. Combined with the MCP Tool handler type (v2.1.118), hooks can now orchestrate entire external service workflows from a single hook invocation.[8][9]
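As one concrete illustration of the innovations table, the $CLAUDE_ENV_FILE row can be sketched as a SessionStart hook helper. The `export NAME=value` line format is an assumption about what the file expects, and the variable names are placeholders.

```python
import os
import shlex
from typing import Optional


def render_exports(variables: dict[str, str]) -> str:
    """Render variables as shell export lines, quoting values safely."""
    return "".join(
        f"export {name}={shlex.quote(value)}\n" for name, value in variables.items()
    )


def persist_env(variables: dict[str, str], env_file: Optional[str] = None) -> bool:
    """Append export lines to the file named by $CLAUDE_ENV_FILE.

    Returns False when no env file is available, so the hook can
    degrade gracefully outside a supporting session.
    """
    env_file = env_file or os.environ.get("CLAUDE_ENV_FILE")
    if not env_file:
        return False
    with open(env_file, "a", encoding="utf-8") as handle:
        handle.write(render_exports(variables))
    return True


print(render_exports({"NODE_ENV": "production", "APP_NAME": "demo app"}), end="")
```

Appending rather than overwriting matters here, since several hooks may contribute variables during the same session.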
MCP servers can pause execution and request structured input from users via elicitation/create. Two modes: Form (structured JSON schema validation) and URL (redirect to external URL for OAuth, payment flows).[24] Hook integration: Elicitation and ElicitationResult events allow hooks to intercept and pre-answer elicitation requests programmatically.[24] Security constraint: servers MUST NOT require PII, credentials, or sensitive data through elicitation — sensitive operations must use URL mode.[24]
| Location | Scope |
|---|---|
| ~/.claude/settings.json[8] | All projects (user-level) |
| .claude/settings.json[8] | Single project (git-tracked) |
| .claude/settings.local.json[8] | Single project (gitignored) |
| Managed policy[8] | Organization-wide |
| Plugin hooks/hooks.json[44] | When plugin enabled |
| Skill/Agent frontmatter[44] | Component lifecycle only |
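A sketch of registering a command hook in one of the settings files listed above. The nesting used here (hooks, then event name, then a list of matcher groups each holding a hooks list) is an assumption about the settings schema and should be checked against the official docs; the script path is a placeholder.

```python
import json


def register_command_hook(settings: dict, event: str, matcher: str, command: str) -> dict:
    """Return a copy of `settings` with one command hook added under `event`."""
    updated = json.loads(json.dumps(settings))  # deep copy of JSON-compatible data
    groups = updated.setdefault("hooks", {}).setdefault(event, [])
    entry = {"type": "command", "command": command}
    for group in groups:
        if group.get("matcher") == matcher:
            group.setdefault("hooks", []).append(entry)
            break
    else:
        # No group for this matcher yet, so create one.
        groups.append({"matcher": matcher, "hooks": [entry]})
    return updated


settings = register_command_hook(
    {}, "PreToolUse", "Bash", "python3 .claude/hooks/guard.py"
)
print(json.dumps(settings, indent=2))
```

Writing the result to .claude/settings.json shares the hook with the whole project through git, while .claude/settings.local.json keeps it personal.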
3,600 stars, 602 forks. 13 demonstrated hook types.[14] Key patterns:
Claude Code shipped 5 major capability releases across Weeks 13–17 (March 23 – April 24, 2026): /ultraplan, Monitor Tool, /ultrareview, Routines, and Agent Teams, alongside a quality regression postmortem that revealed prompt changes have measurable intelligence impact.[20][9]
| Week / Version | Feature | Status | Impact |
|---|---|---|---|
| Week 13 (Mar 23–27)[9] | Auto mode research preview; PR auto-fix; PowerShell tool (Windows); conditional if hooks | Research Preview | Windows parity; hook precision |
| Week 14 v2.1.86–91 (Mar 30–Apr 3)[46] | Computer Use in CLI (macOS); /powerup; MCP Result-Size → 500K chars; Plugin bin/ on PATH; defer PreToolUse; PermissionDenied hook | Research Preview / GA | Human-in-loop; native app testing; large MCP payloads |
| Week 15 v2.1.98–101 (Apr 6–10)[20] | /ultraplan cloud planning; Monitor Tool; /autofix-pr; /team-onboarding; sessionTitle hook field | Research Preview / GA | Eliminates sleep polling; autonomous CI fix loop |
| Week 16 (Apr 13–17)[9] | Opus 4.7 as new default; xhigh effort level; Routines (scheduled cloud agents); /ultrareview announced | Research Preview | Higher baseline reasoning; scheduled automation |
| Week 17 v2.1.116–119 (Apr 20–24)[9] | /ultrareview public preview; custom themes; MCP tool via hooks; faster startup; model persistence; GitLab/Bitbucket PR URL support | Research Preview / GA | Multi-agent review; startup <1s; Git forge parity |
/ultraplan runs 30-minute remote planning sessions on Anthropic infrastructure using Opus 4.6. The local terminal polls every 3 seconds for updates. Plans get a browser-based review with inline comments and emoji reactions before execution, and can be "teleported" back to the local working directory for execution. v2.1.101: First run auto-creates a default cloud environment (no web setup needed).[15][20]
/ultrareview has been in public research preview since April 22, 2026 (Week 17, v2.1.116–119).[12]
| Dimension | Detail |
|---|---|
| How to use[12] | /ultrareview (current branch vs default) or /ultrareview 1234 (PR #1234) |
| Architecture[12] | Setup (~90s, ~5 agents) → Find (parallel: race conditions, logic errors, type mismatches) → Verify (independent reproduction) → Result (verified bugs only) |
| Duration[12] | 5–10 minutes |
| Pricing[12] | Pro/Max: 3 free runs through May 5, 2026. Team/Enterprise: billed from start. Typical: $5–$20 per review |
| Limitations[12] | Not on Bedrock, Vertex, Foundry, Zero Data Retention orgs. Requires claude.ai account |
The Monitor Tool is a background watcher that streams events into the conversation as transcript messages, which Claude reacts to immediately. It eliminates Bash sleep polling loops. It monitors log files, CI/CD pipeline results, and dev server crashes. Integration with /loop: Claude now self-paces, scheduling the next tick based on the task or using Monitor to skip polling entirely.[20]
Routines are saved Claude Code configurations that run automatically on Anthropic-managed cloud infrastructure. Three trigger types (combinable): Schedule (hourly, daily, weekdays, weekly, or one-off timestamp), API (HTTP POST to per-routine endpoint with bearer token), GitHub (PR events with filter conditions: author, title, body, base branch, head branch, labels, is_draft, is_merged).[21]
Usage limits: Pro = 5/day, Max = 25/day scheduled runs; GitHub webhook and one-off runs do NOT count toward limit.[21] Repositories cloned fresh per run. Creates claude/-prefixed branches by default.[21]
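The API trigger (HTTP POST to a per-routine endpoint with a bearer token) can be sketched with the standard library. The endpoint URL and the payload shape below are placeholders; only the POST method and bearer-token header come from the description above.

```python
import json
import urllib.request


def build_trigger_request(endpoint: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that fires a routine."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_trigger_request(
    "https://example.invalid/routines/nightly-audit/trigger",  # placeholder URL
    token="ROUTINE_TOKEN",                                      # placeholder token
    payload={"reason": "sentry-alert"},                         # hypothetical body
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=30)
```

Keeping request construction separate from sending makes the trigger easy to unit test and lets an alerting system attach its own context to the payload.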
Agent Teams are enabled with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 and require v2.1.32+.[63] Architecture: a team lead plus teammates (separate Claude Code instances with their own context windows), a shared task list with dependency tracking and file-locking, and a mailbox for direct agent-to-agent messages.[63]
vs Subagents: Subagents report to main agent only; teammates communicate directly with each other.[63]
Quality gate hook integration: TeammateIdle (exit 2 = send feedback, keep working), TaskCreated (exit 2 = prevent task creation), TaskCompleted (exit 2 = prevent task completion).[63]
Limitations: No session resumption with in-process teammates, task status can lag, no nested teams, no per-teammate permission modes at spawn.[63]
Three simultaneous quality issues discovered and fixed in April 2026:[9]
Key finding: "Length limits: keep text between tool calls to ≤25 words" caused measurable intelligence drops. Lesson: prompt changes have detectable, measurable quality impact.[9]
Key finding: The Monitor Tool and Routines together shift Claude Code from interactive-only to autonomous — agents can now self-schedule, react to external events (GitHub PRs, Sentry alerts via API trigger), and run nightly tasks without any human trigger.[20][21]
Anthropic launched Managed Agents public beta (April 8, 2026) with built-in filesystem memory (April 23, 2026) and decoupled the agent Brain from Hands via durable server-side session events. A source map leak also revealed plans for KAIROS persistent memory (May 2026), Coordinator Mode orchestration, and Daemon Mode persistent sessions.[13][49][61][15]
Public beta launched April 8, 2026. Pricing: standard token rates + $0.08 per session-hour for active runtime.[49]
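As a worked example of the pricing formula above (token cost plus $0.08 per active session-hour), the following sketch estimates a session's total. The token cost is passed in as an input because standard token rates vary by model; the function name is illustrative.

```python
SESSION_HOUR_RATE_USD = 0.08  # active-runtime rate quoted in the text above


def managed_agent_cost(token_cost_usd: float, active_hours: float) -> float:
    """Estimate session cost: token spend plus the per-session-hour runtime fee."""
    if token_cost_usd < 0 or active_hours < 0:
        raise ValueError("costs and hours must be non-negative")
    return round(token_cost_usd + active_hours * SESSION_HOUR_RATE_USD, 4)


# A 2.5-hour active session adds 2.5 * 0.08 = $0.20 on top of token spend.
estimate = managed_agent_cost(token_cost_usd=1.20, active_hours=2.5)
```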
| Concept | Description |
|---|---|
| Agent[49] | Model + system prompt + tools + MCP servers + skills |
| Environment[49] | Configured container (packages, network access, mounted files) |
| Session[49] | Running agent instance performing a specific task |
| Events[49] | Messages between app and agent — persisted server-side, outside agent sandbox |
Decoupled architecture: Separates Brain (Claude + harness), Hands (sandboxes), and Session (event log stored outside both). Events can be transformed in the harness before being passed to Claude, which supports context organization and prompt-caching optimization.[13]
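The separation can be illustrated with a toy append-only event log plus a harness-side transform. This is a conceptual sketch only: the class names, event kinds, and truncation rule are assumptions for illustration and do not reflect Anthropic's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Event:
    seq: int
    kind: str   # hypothetical kinds, e.g. "user_message", "tool_result"
    body: str


@dataclass
class SessionLog:
    """Append-only log kept outside both the harness (Brain) and sandbox (Hands)."""
    _events: List[Event] = field(default_factory=list)

    def append(self, kind: str, body: str) -> Event:
        event = Event(seq=len(self._events), kind=kind, body=body)
        self._events.append(event)
        return event

    def replay(self, since: int = 0) -> List[Event]:
        return list(self._events[since:])


def build_context(log: SessionLog,
                  transform: Callable[[Event], Optional[Event]]) -> List[Event]:
    """Harness-side step: transform or drop events before they reach the model."""
    transformed = (transform(e) for e in log.replay())
    return [e for e in transformed if e is not None]


def truncate_tool_output(event: Event, limit: int = 40) -> Event:
    """Example transform: shorten long tool results; the stored log is untouched."""
    if event.kind == "tool_result" and len(event.body) > limit:
        return Event(event.seq, event.kind, event.body[:limit] + "...[truncated]")
    return event


log = SessionLog()
log.append("user_message", "Run the test suite")
log.append("tool_result", "x" * 500)
context = build_context(log, truncate_tool_output)
```

Because the transform produces new events rather than mutating stored ones, a restarted harness can replay the full log and rebuild context from scratch.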
Performance from decoupling: P50 TTFT ~60% reduction; P95 TTFT >90% reduction.[13]
Credential isolation: Two mechanisms — resource-bundled auth (embedded in remote URLs) + external vault + proxy pattern — prevent prompt injection from compromising credentials.[13]
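The vault-plus-proxy mechanism can be sketched as follows: sandboxed code refers to a credential by name only, and a proxy outside the sandbox resolves the name and attaches the secret. All class and field names here are hypothetical; this illustrates the pattern, not the actual Managed Agents implementation.

```python
from dataclasses import dataclass


class Vault:
    """Holds secrets outside the sandbox; only the proxy can call resolve()."""

    def __init__(self, secrets: dict):
        self._secrets = dict(secrets)

    def resolve(self, name: str) -> str:
        return self._secrets[name]


@dataclass(frozen=True)
class SandboxRequest:
    url: str
    credential_ref: str  # a name such as "github", never the secret itself


def proxy_forward(request: SandboxRequest, vault: Vault) -> dict:
    """Runs outside the sandbox: attach the real credential just before sending."""
    token = vault.resolve(request.credential_ref)
    return {"url": request.url, "headers": {"Authorization": f"Bearer {token}"}}


vault = Vault({"github": "s3cr3t-token"})
sandbox_request = SandboxRequest("https://api.example.invalid/repos", "github")
outbound = proxy_forward(sandbox_request, vault)
```

Because the sandbox only ever holds `credential_ref`, injected instructions that dump the agent's environment have no secret to exfiltrate.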
Built-in tools: Bash, file operations, web search/fetch, MCP servers, sub-agent spawning (research preview). Built-in prompt caching and context compaction (automatic).[49]
Filesystem-based persistent memory — workspace-scoped Markdown files agents read/write/update across sessions.[61]
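A minimal sketch of the idea, assuming a hypothetical `memory/` directory layout and one bullet per note (the actual file layout used by Managed Agents is not described in the source):

```python
import tempfile
from pathlib import Path


def remember(workspace: Path, topic: str, note: str) -> Path:
    """Append a note to the workspace-scoped Markdown file for a topic."""
    path = workspace / "memory" / f"{topic}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {note}\n")
    return path


def recall(workspace: Path, topic: str) -> list:
    """Read back all notes for a topic; an unknown topic yields an empty list."""
    path = workspace / "memory" / f"{topic}.md"
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


# Two separate "sessions" share state only through files on disk.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    remember(workspace, "build", "use pnpm, not npm")      # session 1
    notes_seen_later = recall(workspace, "build")          # session 2
```

Plain Markdown keeps the memory diffable and reviewable with ordinary tools, which is the audit-trail property the source emphasizes.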
Real-world results: Rakuten: 97% fewer first-pass errors, 27% lower cost, 34% lower latency. Wisedocs: 30% faster document verification.[61]
Key finding: Rakuten's 97% reduction in first-pass errors came from Managed Agents memory built on workspace-scoped Markdown files that survive session resets. It is not a vector database or a proprietary API, just plain files with audit trails.[61]
Architecture from March 31, 2026 source map leak: daily append-only markdown logs at ~/.claude/.../logs/YYYY/MM/DD.md.[15]
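The dated, append-only layout can be sketched as below. The leaked path is elided in the source, so the log root is left as a caller-supplied parameter rather than guessed; the entry format is an assumption.

```python
from datetime import date, datetime
from pathlib import Path


def daily_log_path(logs_root: Path, day: date) -> Path:
    """Map a date to <logs_root>/YYYY/MM/DD.md."""
    return logs_root / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}.md"


def append_entry(logs_root: Path, when: datetime, text: str) -> Path:
    """Append one line to that day's log; existing content is never rewritten."""
    path = daily_log_path(logs_root, when.date())
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:  # "a" mode keeps it append-only
        fh.write(f"- {when:%H:%M} {text}\n")
    return path


example = daily_log_path(Path("logs"), date(2026, 3, 31))
```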
Auto-Dream Consolidation Cycle (4 phases):[15]
Triggers after 24 hours AND minimum 5 sessions. 15-second blocking budget per phase before auto-backgrounding. Exclusive KAIROS tools: SendUserFile, PushNotification, SubscribePR, SleepTool. Teaser April 1–7, 2026. Full launch planned May 2026.[15]
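The trigger condition (24 hours elapsed and at least 5 sessions) is a simple conjunction. A sketch, with hypothetical function and parameter names:

```python
from datetime import datetime, timedelta


def should_consolidate(last_run: datetime, now: datetime, sessions_since: int,
                       min_gap: timedelta = timedelta(hours=24),
                       min_sessions: int = 5) -> bool:
    """Both conditions must hold: enough elapsed time AND enough sessions."""
    return (now - last_run) >= min_gap and sessions_since >= min_sessions


last = datetime(2026, 4, 1, 9, 0)
ready = should_consolidate(last, datetime(2026, 4, 2, 9, 0), sessions_since=5)
too_few = should_consolidate(last, datetime(2026, 4, 3, 9, 0), sessions_since=4)
too_soon = should_consolidate(last, datetime(2026, 4, 1, 20, 0), sessions_since=9)
```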
One Claude as coordinator, spawning isolated worker instances with dedicated scratch directories. XML-based <task-notification> protocol: status, summaries, token usage, duration metrics. 4-phase workflow: research → specification → implementation → verification. Activation: CLAUDE_CODE_COORDINATOR_MODE=1. Status: requires additional development before release.[15]
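A coordinator consuming such notifications might parse them as shown below. Only the `<task-notification>` root tag and the four kinds of fields come from the source; the child element names (`status`, `summary`, `tokens`, `duration_ms`) and the sample message are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; only the root tag name is taken from the source.
SAMPLE = """<task-notification>
  <status>completed</status>
  <summary>Implemented retry logic</summary>
  <tokens>18250</tokens>
  <duration_ms>94000</duration_ms>
</task-notification>"""


def parse_notification(xml_text: str) -> dict:
    """Parse a worker's notification into a plain dict for the coordinator."""
    root = ET.fromstring(xml_text)
    if root.tag != "task-notification":
        raise ValueError(f"unexpected root element: {root.tag}")
    return {
        "status": root.findtext("status", default=""),
        "summary": root.findtext("summary", default=""),
        "tokens": int(root.findtext("tokens", default="0")),
        "duration_ms": int(root.findtext("duration_ms", default="0")),
    }


notification = parse_notification(SAMPLE)
```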
| Feature | Description |
|---|---|
| Daemon Mode[15] | Background persistent sessions in tmux: claude --bg, claude daemon ps |
| UDS Inbox[15] | Unix domain sockets for inter-process communication between Claude instances |
| Bridge Mode[15] | Remote control from phones/browsers via WebSocket |
| Unreleased models[15] | opus-4-7, sonnet-4-8; codenames: Capybara, Fennec, Numbat |
| 26 slash commands[15] | Including /ctx-viz (context visualization), /btw (side questions), /bughunter |
Controversial leaked features (treat as unverified): "Undercover Mode" (CLAUDE_CODE_UNDERCOVER) strips AI evidence from git commits; "Anti-Distillation" injects fake tools to poison competitor training pipelines; Frustration Detection monitors negative language and adapts tone.[15]
MCP donated to the Linux Foundation's Agentic AI Foundation in December 2025. OpenAI and Google DeepMind adopted MCP in early 2025. Registry: 400+ community-built servers, top 50 averaging 12,000+ monthly installs each.[50]
Claude Design (April 17, 2026): New Anthropic Labs product for collaborative visual creation of designs, prototypes, slides, and one-pagers. Separate from Claude Code but integrated into the Figma MCP loop.[9]
The MCP ecosystem reached 70+ production-ready servers across 12 categories in 2026, out of 10,000+ MCP servers listed across directories, most of which are weekend projects.[16] Remote/hosted MCP deployment emerged as the dominant Q1 2026 trend: Azure 2.0, VictoriaMetrics, Supabase, Sentry, and Jira all moved to remote OAuth, eliminating local installation.[16]
| Category | New Arrivals (Q1-Q2 2026) | Notable |
|---|---|---|
| Cloud Infrastructure[16] | Azure MCP Server 2.0 (Microsoft) | Remote HTTP, dramatically expanded Azure coverage, security hardening |
| Project Management[16] | Asana, Shortcut, Plane, Smartsheet, Wrike (all official, Feb–Apr 2026); ClickUp expanded 6 → 49 tools | Full PM category now official-server covered |
| Metrics[16] | VictoriaMetrics Hosted MCP Server | No local server needed — pure remote |
| Observability[16] | Datadog, Grafana, Sentry, Prometheus, New Relic APM | Now recognized dedicated category |
| Vector & RAG[16] | Qdrant, Pinecone, Weaviate, Chroma, LlamaIndex | Emerging enterprise category |
| Enterprise[16] | Stripe, HubSpot, Salesforce, Snowflake, BigQuery, Docker, Kubernetes, Terraform | All production-grade with official support |
| Tool | Stars | Function |
|---|---|---|
| MindsDB[16] | 39,000 | Connects and unifies data across platforms and databases |
| Agent-Browser (Vercel)[10] | 29,500+ | 14.7× token reduction browser automation |
| Chrome DevTools MCP[10] | 37,400 | Official Google debugging/performance MCP |
| claude-plugins-official[7] | 18,000 | Anthropic's official plugin/skill directory |
| MetaMCP[23] | 2,200 | Unified MCP middleware with GUI tool picker |
| AnyQuery[16] | 1,700 | Query 40+ apps with SQL |
Based on the Q1–Q2 2026 ecosystem state, the following are definitively shipping — do not build custom alternatives:
| Problem Domain | What Exists | Maturity | Action |
|---|---|---|---|
| Accessibility scanning in IDE[3] | Deque Axe MCP Server (GA Feb 2026) | GA, enterprise-backed | Use if you have Axe DevTools license; otherwise use Community-Access OSS bundle |
| Design token extraction from Figma[4] | Figma MCP Server (official, hosted OAuth) | Production since June 2025 | Use directly; no custom extractor needed |
| Code → Figma capture[37] | Figma generate_figma_design tool (Feb 17, 2026) | GA | Use; no custom screenshot-to-Figma pipeline needed |
| Design system rules in CLAUDE.md[38] | Figma "Create Design System Rules" skill (official) | GA | Run skill; do not hand-write design system rules |
| Claude Code telemetry / OTel export[29] | Built-in OTLP (CLAUDE_CODE_ENABLE_TELEMETRY=1) | Stable (metrics/logs); Beta (traces) | Configure env vars; no wrapper needed |
| Hook human-in-the-loop approval[8] | defer decision in PreToolUse (v2.1.89) | GA | Use defer; do not build approval polling infrastructure |
| Scheduled autonomous agents[21] | Routines (research preview, April 2026) | Research Preview | Use for non-critical workflows; build on stable API for production |
| Cloud-based code review[12] | /ultrareview ($5–$20/review, 5–10 min) | Research Preview | Use; do not build multi-agent review pipeline — it ships |
| MCP context exhaustion[34] | MetaMCP (2,200 stars) for tool picking; deferred tool schemas already in Claude Code | Production (MetaMCP) / Shipping (lazy loading) | Use MetaMCP gateway; deferred schemas already active in current sessions |
| Persistent cross-session memory[61][15] | Managed Agent Memory (Beta Apr 23) + KAIROS (May 2026 launch) | Beta / Imminent | Evaluate Managed Agent memory now; wait on KAIROS for local CLI memory |
| Background process monitoring without polling[20] | Monitor Tool (v2.1.98) | GA | Use Monitor tool; delete all sleep-loop patterns |
| Component isolation testing (React)[27] | Storybook MCP Server (v10.3+) | Production (React only) | Use if on React; watch for Vue/Angular support |
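For the telemetry row in the table above, configuration is a matter of environment variables. The sketch below assembles them in Python. `CLAUDE_CODE_ENABLE_TELEMETRY` comes from the source; the `OTEL_*` names are standard OpenTelemetry SDK variables, and the endpoint and protocol values are assumptions for a local collector.

```python
import os
from typing import Optional


def telemetry_env(endpoint: str, base: Optional[dict] = None) -> dict:
    """Return a copy of the environment with OTLP export switched on."""
    env = dict(os.environ if base is None else base)
    env.update({
        "CLAUDE_CODE_ENABLE_TELEMETRY": "1",     # flag named in the table above
        "OTEL_METRICS_EXPORTER": "otlp",         # standard OpenTelemetry variable
        "OTEL_LOGS_EXPORTER": "otlp",            # standard OpenTelemetry variable
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",   # assumed collector protocol
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
    })
    return env


# Assumed local collector address; pass `env` to whatever launches the CLI.
env = telemetry_env("http://localhost:4317", base={})
```

Returning a new dict instead of mutating `os.environ` keeps telemetry scoped to the child process that receives it.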
Key finding: The 2026 Q1–Q2 release cadence compressed 2+ years of expected tooling gaps into two quarters. Teams building custom observability wrappers, approval-gate systems, scheduled agent runners, or cloud review pipelines are duplicating infrastructure that Anthropic ships at a fraction of the cost and with direct platform support.[20][12][21][29]
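As an illustration of replacing a custom approval gate, a PreToolUse hook can return a decision instead of polling an external system. This sketch assumes that `defer` is emitted through the same `permissionDecision` field used for allow/deny/ask decisions, which the source does not spell out. The risky-command list is a hypothetical policy.

```python
from typing import Optional

RISKY_MARKERS = ("git push", "rm -rf", "terraform apply")  # hypothetical policy


def pre_tool_use_decision(payload: dict) -> Optional[dict]:
    """Return hook output that defers risky Bash commands, else None (no opinion)."""
    if payload.get("tool_name") != "Bash":
        return None
    command = payload.get("tool_input", {}).get("command", "")
    if any(marker in command for marker in RISKY_MARKERS):
        return {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                # Assumption: "defer" travels in the permissionDecision field.
                "permissionDecision": "defer",
                "permissionDecisionReason": "Risky command; waiting for human approval.",
            }
        }
    return None


deferred = pre_tool_use_decision(
    {"tool_name": "Bash", "tool_input": {"command": "git push origin main"}}
)
ignored = pre_tool_use_decision(
    {"tool_name": "Bash", "tool_input": {"command": "ls -la"}}
)
# In a real hook script: read the payload with json.load(sys.stdin) and, when a
# decision is returned, print it to stdout with json.dumps(...).
```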