Guardbase Launches Coding Agent Attack Matrix
Most threat-modeling work on coding agents stops at “prompt injection.” That’s the entry point, not the impact. The interesting question is what happens after; how a compromised agent uses native functionality (web_fetch, mcp__gmail__send_email, aws s3 sync, code execution) to write and execute malware, dump production databases, or pivot laterally. Often bypassing EDR, because the agent isn’t malware; it’s the developer (or analyst, or HR admin), behaving normally, with different intent.
We built the Coding Agent Attack Matrix to answer that question systematically. It maps 40+ techniques across the 12 MITRE ATT&CK tactics, specific to endpoint agents; Claude Code first, extensible to others. Each technique includes concrete detection strategies and defenses specific to the agent layer.
Scope: Endpoint agents, any user, local impact
This framework covers endpoint AI agents that run on developers’, analysts’, sales reps’, and HR teams’ laptops (e.g. Claude Code/Cowork, Gemini, Cursor, OpenClaw). It assumes compromise via supply-chain attacks (malicious skills/MCP servers) or prompt injection (untrusted content from web search, etc.), and focuses on how the compromised agent can then compromise the endpoint and connected resources (cloud accounts, internal networks, etc.). The threat is agent-native: what it can do using its built-in functionality, not lower-level OS exploits.
Existing frameworks don’t help much. MITRE ATT&CK is product-agnostic and predates agentic AI; MITRE ATLAS covers threats to AI and ML systems broadly. But endpoint agent threats are different. The Coding Agent Attack Matrix is purpose-built for a narrower, more specific threat: coding agents that run on developer laptops (Claude Code, Cursor, Aider, etc.) and the endpoint/cloud resources they can compromise.
These frameworks are valuable for their scope, but they don’t tell you that a Claude Code agent can edit settings.json’s env block to disable your OTEL telemetry on the next session start - making the rest of your detection blind - or which setting prevents it. And because each agent has different capabilities, different detection hooks, and different remediation options, this framework is built per-agent.
Each technique in the matrix maps to concrete Claude Code settings, OTEL events, and detection grading. Strong defenses get named (allowManagedHooksOnly, sandbox denyRead). Best-effort defenses are flagged as best-effort. Threats that aren’t detectable at the agent layer are marked out of scope rather than hidden behind hopeful language.
Out of scope by design: traditional Active Directory traversal, physical attacks, compromises of the underlying OS kernel, attacks that require privileges the user doesn’t have. Not because those don’t matter, they do, but because endpoint agents often make them unnecessary. Why traverse a network when the agent on the laptop already has cloud admin? Why escalate when the developer already has sudo?
Feedback welcome. If we missed a technique or have a suggestion for Claude Code, please reach out. We’re expanding this framework to other endpoint agents, so if you’re building or securing a similar agent, get in touch.