Version 1.0 • Last updated 2026-05-12
Coding Agent Attack Matrix
Endpoint-based AI agents operate autonomously across your files, CLIs, and MCP servers—creating new attack vectors for organizations. This matrix maps common abuse techniques to MITRE ATT&CK tactics, providing a threat model to identify gaps in your security controls.
Initial Access
5
Drive-by compromise via prompt injections in web fetch / web search
Malicious agent skill
Rug-pull attack on a malicious remote MCP server
Backdoor attack in a STDIO-based MCP server
Indirect prompt injection via code, files, or MCP content
Execution
2
Let the agent write its own malware
Indirect execution via the dev loop
Persistence
4
Add hooks
CRON job
Modification of system prompt / memory
Plant a malicious agent skill
Privilege Escalation
4
Loosen permission/sandbox configuration
Abuse sandbox escape hatches
Route execution via shell-equivalent STDIO-based MCP tools
Break out of outer container/VM
Defense Evasion
4
Disable hooks
Disable OTEL telemetry
Bypass HTTP-proxy
Keep actions in markdown and execute bash commands one-by-one
Credential Access
2
Disk-resident credential files
OS keystore
Discovery
4
Filesystem and volume enumeration
Package and binary enumeration
Network and process enumeration
Cloud environment enumeration
Lateral Movement
4
Abuse SSH keys to hop to other servers
Ambient CLI abuse
Internal HTTP API abuse
Infrastructure MCP abuse
Collection
6
MCP-based collection
Cloud bucket / volume downloads
Database dumps via ambient CLI
Off-project file reads
Prior session history
Environment Variables
Command-and-Control
4
CRON-based beaconing
Claude Code Channels
Hook-based beaconing
Web fetch for instructions
Exfiltration
4
Web fetch / curl / wget with data
Hook-based exfiltration
Malicious MCP server
Connected-MCP messaging exfiltration
Impact
2
Destructive shell commands
Destructive git operations
Initial Access
How an attacker gets control of the agent's instruction stream or code execution path. Coding agents consume huge amounts of untrusted content (web pages, repos, MCP servers, skills, docs), creating a much broader attack surface than typical applications.
5 techniques
Drive-by compromise via prompt injections in web fetch / web search
The agent uses web fetch or web search to land on a page the attacker has seeded with hidden instructions, and executes them as if the user had issued them.
Malicious agent skill
A malicious skill (scripts + markdown extending the agent) is downloaded by the user. Loading it executes embedded instructions or code.
Rug-pull attack on a malicious remote MCP server
A remote MCP server starts benign, gains trust, then changes its tool descriptions or output to inject instructions.
Backdoor attack in a STDIO-based MCP server
A locally-installed MCP server (running over stdio) ships with malicious code that runs on the user's machine with full permissions.
Indirect prompt injection via code, files, or MCP content
The agent reads attacker-controlled content (source files, code repositories, documentation, MCP tool outputs, API responses) that contains hidden instructions. The agent incorporates this content into its context and executes the embedded instructions as if they were user directives.
Execution
Running attacker-controlled code via the agent's bash tool.
2 techniques
Let the agent write its own malware
Once the attacker has any input channel, they ask the agent to author malicious code (downloader, key-stealer, command-and-control, etc.) and run it via bash.
Indirect execution via the dev loop
The agent edits a file that something else on the machine runs: `package.json` install scripts, pre-commit hooks, test/build config, watcher-triggered configs, shell rc files. Execution happens outside the bash tool, bypassing permission prompts, the sandbox, and audit logs.
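One way to catch this is to review file edits, not only bash commands. The sketch below is a minimal illustration, assuming a hypothetical `is_auto_executed` helper and a short, incomplete watch list drawn from the examples above; a real deployment would need a list tuned to its own toolchain.

```python
from pathlib import PurePosixPath

# Hypothetical watch list: files that other tooling runs automatically.
# It is illustrative only and far from complete.
AUTO_EXEC_FILES = {"package.json", ".pre-commit-config.yaml", ".bashrc", ".zshrc", ".profile"}
AUTO_EXEC_DIRS = (".git/hooks",)


def is_auto_executed(path: str) -> bool:
    """Return True if an edit to `path` could lead to execution outside the bash tool."""
    p = PurePosixPath(path)
    if p.name in AUTO_EXEC_FILES:
        return True
    # Anchor with slashes so ".git/hooks" only matches whole path components.
    anchored = "/" + p.as_posix() + "/"
    return any(f"/{d}/" in anchored for d in AUTO_EXEC_DIRS)
```

A file-edit hook or log processor could call this on every written path and route matches to a human for review.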
Persistence
Maintaining attacker access across sessions or after the user thinks the task is done.
4 techniques
Add hooks
The agent's hook system (scripts running on agent events) is abused to install code that runs on every future session.
CRON job
The agent uses Claude Code's scheduling tools (`/loop`, `CronCreate`, `Monitor`) to schedule prompts that re-fire on an interval. Tasks survive `claude --resume` within 7 days but die on a fresh conversation.
Modification of system prompt / memory
Edit `CLAUDE.md`, memory files, or other agent-customization files so the next session loads attacker instructions.
Plant a malicious agent skill
The agent writes a skill to disk. Claude Code's file watcher loads it in the current session; it activates when auto-matched or the user runs its slash command, executing any embedded shell blocks.
Privilege Escalation
Gaining capabilities the agent wasn't meant to have.
4 techniques
Loosen permission/sandbox configuration
The agent edits `settings.json` to widen permissions or weaken the sandbox, for example by adding entries to `permissions.allow`, setting `defaultMode: "bypassPermissions"`, disabling `sandbox.enabled`, or expanding `excludedCommands`.
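A before/after comparison of the settings file makes this kind of loosening visible. The following is a rough sketch, not a complete audit: the `loosened` function is hypothetical, and the nesting of `defaultMode` under `permissions` and `excludedCommands` under `sandbox` is an assumption about the file layout that should be checked against the actual schema.

```python
def loosened(before: dict, after: dict) -> list[str]:
    """List the ways `after` is more permissive than `before` (assumed key layout)."""
    findings = []
    b_perm, a_perm = before.get("permissions", {}), after.get("permissions", {})
    b_sbx, a_sbx = before.get("sandbox", {}), after.get("sandbox", {})

    # New entries in permissions.allow
    for rule in sorted(set(a_perm.get("allow", [])) - set(b_perm.get("allow", []))):
        findings.append(f"new allow rule: {rule}")

    # defaultMode switched to bypassPermissions
    if (a_perm.get("defaultMode") == "bypassPermissions"
            and b_perm.get("defaultMode") != "bypassPermissions"):
        findings.append("defaultMode set to bypassPermissions")

    # Sandbox was on and is no longer on
    if b_sbx.get("enabled") is True and a_sbx.get("enabled") is not True:
        findings.append("sandbox disabled")

    # New commands excluded from the sandbox
    new_excluded = set(a_sbx.get("excludedCommands", [])) - set(b_sbx.get("excludedCommands", []))
    for cmd in sorted(new_excluded):
        findings.append(f"new excluded command: {cmd}")

    return findings
```

Any non-empty result is a candidate for blocking the edit or asking the user to confirm it.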
Abuse sandbox escape hatches
The agent uses documented sandbox escape mechanisms. Two settings provide unsandboxed execution: `excludedCommands` (commands that bypass the sandbox entirely) and `allowUnsandboxedCommands` (fallback for failed commands). Docker is the primary example: once the agent runs `docker`, it gains full host access via container mounts, privileged containers, or socket access.
Route execution via shell-equivalent STDIO-based MCP tools
Requires an MCP whose capability overlaps with bash. Shell-execution MCPs (e.g., `mcp-shell-server`) expose an `execute_command` tool that runs commands in the unsandboxed MCP process. Broad-scope filesystem MCPs can read/write files the bash sandbox would block.
Break out of outer container/VM
The agent uses a kernel or hypervisor exploit, or a sandbox escape via memory pollution/copying, to leave the outer container or VM. This is out of the agent's reach without a 0-day.
Defense Evasion
Avoiding the controls that would otherwise catch the attack.
4 techniques
Disable hooks
The agent removes or modifies hook configuration (via `settings.json` edits that require human-in-the-loop approval) so that monitoring and blocking hooks no longer fire.
Disable OTEL telemetry
The agent edits user or project `settings.json` to remove OTEL env vars (`CLAUDE_CODE_ENABLE_TELEMETRY`, `OTEL_LOG_TOOL_DETAILS`, etc.). Telemetry settings are read at process startup, so the edit takes effect on the next Claude Code session; the current session keeps emitting.
Bypass HTTP-proxy
The agent unsets `HTTP_PROXY` / `HTTPS_PROXY` env vars to stop network traffic from flowing through the monitoring proxy.
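Commands that clear the proxy variables have a recognizable shape. The sketch below is one possible pattern match, assuming a hypothetical `bypasses_proxy` helper; the lower-case variable names and the specific shell forms covered are assumptions, and determined evasion (for example, tool-specific no-proxy flags) would need additional rules.

```python
import re

# Illustrative patterns only; they cover a few common shell forms.
PROXY_BYPASS_RE = re.compile(
    r"""
    \bunset\s+(?:-v\s+)?(?:HTTPS?_PROXY|https?_proxy)\b    # unset HTTP_PROXY
    | \benv\s+-u\s+(?:HTTPS?_PROXY|https?_proxy)\b         # env -u HTTPS_PROXY <cmd>
    | \b(?:HTTPS?_PROXY|https?_proxy)=(?:""|''|\s|$)       # HTTP_PROXY= <cmd>, or set to empty
    """,
    re.VERBOSE,
)


def bypasses_proxy(command: str) -> bool:
    """Return True if the command appears to clear a proxy environment variable."""
    return PROXY_BYPASS_RE.search(command) is not None
```

Setting the variable to a real proxy URL does not match, so ordinary proxy configuration is left alone.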
Keep actions in markdown and execute bash commands one-by-one
The agent splits malicious work across many small, individually-innocent bash commands. The overall plan lives in conversation/markdown, never in a single command.
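Because no single command looks malicious, detection has to reason over the whole session. A minimal sketch of that idea follows, assuming hypothetical stage names and substring markers chosen purely for illustration; real correlation would use richer signals than substring matching.

```python
# Hypothetical stages of a split-up exfiltration and crude markers for each.
STAGES = {
    "credential_read": ("~/.aws/credentials", "~/.ssh/", ".netrc"),
    "encode": ("base64", "tar ", "gzip"),
    "egress": ("curl ", "wget "),
}


def session_stages(commands: list[str]) -> set[str]:
    """Return the set of stages observed anywhere in the session's commands."""
    seen = set()
    for cmd in commands:
        for stage, markers in STAGES.items():
            if any(marker in cmd for marker in markers):
                seen.add(stage)
    return seen


def looks_like_staged_exfil(commands: list[str]) -> bool:
    """Flag a session only when every stage has appeared, in any order."""
    return session_stages(commands) == set(STAGES)
```

Each command in a flagged session may be harmless on its own; the signal is the combination.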
Credential Access
Stealing secrets from the host. Defenses stack in order of reliability: (1) vault credentials so nothing sensitive sits on disk; (2) bash sandbox `denyRead`, which is OS-level and blocks bash and its subprocesses; (3) `Read`/`Bash` deny rules, which cover the Read tool and are best-effort on Grep/Glob.
2 techniques
Disk-resident credential files
The agent reads any file on disk holding plaintext credentials. Common locations: cloud configs (`~/.aws/credentials`, `~/.config/gcloud/`, `~/.azure/`, `~/.kube/config`), SSH (`~/.ssh/`), package registries (`.npmrc`, `.pypirc`, `~/.docker/config.json`, `~/.cargo/credentials`), git (`~/.git-credentials`, `~/.netrc`), shell rc files where tokens are exported (`.bashrc`, `.zshrc`, `.profile`, `.envrc`), and shell history.
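The locations above can be turned into a simple path check for logging or deny decisions. This is a sketch under stated assumptions: `is_credential_path` is a hypothetical helper, the lists cover only part of what is named above, and matching is by path component rather than by resolving symlinks or `~`.

```python
from pathlib import PurePosixPath

# Partial lists taken from the locations named above; extend as needed.
CREDENTIAL_FILES = (
    ".aws/credentials", ".kube/config", ".docker/config.json",
    ".cargo/credentials", ".git-credentials", ".netrc", ".npmrc", ".pypirc",
)
CREDENTIAL_DIRS = (".ssh", ".config/gcloud", ".azure")


def is_credential_path(path: str) -> bool:
    """Return True if `path` points at, or inside, a known credential location."""
    anchored = "/" + PurePosixPath(path).as_posix()
    # File match: the path must end with a whole credential file name.
    if anchored.endswith(tuple("/" + name for name in CREDENTIAL_FILES)):
        return True
    # Directory match: any component sequence equal to a credential directory.
    return any(f"/{d}/" in anchored + "/" for d in CREDENTIAL_DIRS)
```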
OS keystore
The agent extracts secrets from the platform credential store via its CLI: macOS keychain (`security find-generic-password`, `security find-internet-password`), Linux Secret Service (`secret-tool lookup`), Windows Credential Manager (`vaultcmd`, PowerShell `Get-StoredCredential`).
Discovery
Reconnaissance: mapping the host before the next move. Most discovery commands overlap with legitimate developer activity, so prevention is limited; the realistic defense is logging enumeration and alerting on patterns that don't fit the project.
4 techniques
Filesystem and volume enumeration
The agent walks the directory tree (`ls -R`, `find /`, `tree`) and inspects mounted volumes (`mount`, `df`, `/Volumes`, `/mnt`) to map files and disks beyond the project root.
Package and binary enumeration
The agent lists installed packages and available tooling (`dpkg -l`, `rpm -qa`, `brew list`, `pip list`, `npm ls -g`, `which <binary>`, walking `$PATH`).
Network and process enumeration
The agent runs commands to map running processes and reachable hosts. High-volume tools (`ps`, `netstat`, `ss`, `lsof`) are everyday commands and not useful detection signals. The following are worth flagging: `arp -a`, `nmap`, `masscan`, direct reads of `/proc/net/tcp` and `/proc/net/udp`, sequential probes, and DNS sweeps over internal hostnames.
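The split between noisy and flag-worthy commands can be sketched as a small classifier. The `classify` function and its three labels are hypothetical; it covers only the single-command signals listed above, not sequential probes or DNS sweeps, which need correlation across commands.

```python
import shlex

NOISY = {"ps", "netstat", "ss", "lsof"}          # everyday commands, poor signal
FLAG_BINARIES = {"nmap", "masscan"}              # rarely part of normal dev work
FLAG_PATHS = ("/proc/net/tcp", "/proc/net/udp")  # direct socket-table reads


def classify(command: str) -> str:
    """Return 'flag', 'ignore', or 'unknown' for a single shell command."""
    tokens = shlex.split(command)
    if not tokens:
        return "ignore"
    binary = tokens[0].rsplit("/", 1)[-1]  # strip any leading directory
    if binary in FLAG_BINARIES or (binary == "arp" and "-a" in tokens[1:]):
        return "flag"
    if any(tok.startswith(FLAG_PATHS) for tok in tokens[1:]):
        return "flag"
    if binary in NOISY:
        return "ignore"
    return "unknown"
```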
Cloud environment enumeration
The agent runs `aws ...`, `gcloud ...`, `kubectl ...`, `az ...` to map cloud accounts, projects, clusters, and resources.
Lateral Movement
Pivoting from the compromised host to other machines or services using data obtained in earlier stages: credentials (T-020, T-021), ambient CLI auth (T-024), and configured MCPs. Defenses are most effective when they block credential and access acquisition in earlier stages; if that fails, focus on containing the blast radius.
4 techniques
Abuse SSH keys to hop to other servers
The agent uses SSH keys obtained via Credential Access (`~/.ssh/`) to ssh into reachable hosts.
Ambient CLI abuse
The agent uses CLIs that already hold credentials on the developer's machine (`aws`, `gcloud`, `kubectl`, `gh`, `psql`, `mysql`) to reach systems the developer has access to.
Internal HTTP API abuse
The agent uses bearer tokens from env vars or config files (`GITHUB_TOKEN`, `~/.config/gh/hosts.yml`) plus the developer's network access (VPN, Tailscale, ZeroTrust) to call internal APIs via `curl`/`wget`.
Infrastructure MCP abuse
The agent calls infrastructure MCP tools (k8s, cloud, database MCPs) to perform operations against connected systems.
Collection
Gathering target data prior to exfiltration. Most collection is detectable from the shape of the operation (bulk/recursive verbs, explicit dump tools); defenses focus on scope-limiting credentials and MCP access.
6 techniques
MCP-based collection
The agent uses already-authenticated MCP connections (Gmail, Drive, Notion, Linear, etc.) to pull data via `list_*`/`search_*`/`get_*` calls.
Cloud bucket / volume downloads
The agent uses ambient cloud credentials to pull data with recursive copy commands: `aws s3 sync`, `aws s3 cp --recursive`, `gsutil -m cp -r`, `gcloud storage cp -r`, EBS/disk snapshots.
Database dumps via ambient CLI
The agent runs dump tools (`pg_dump`, `mysqldump`, `mongodump`) or inline SQL bulk reads against databases the developer has credentials for.
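The bulk verbs and dump tools named in the two techniques above lend themselves to a shape-based check. A minimal sketch, assuming a hypothetical `is_bulk_collection` helper; it recognizes only the exact command forms listed above and does not cover inline SQL bulk reads or snapshot APIs.

```python
import shlex

DUMP_TOOLS = {"pg_dump", "mysqldump", "mongodump"}


def is_bulk_collection(command: str) -> bool:
    """Return True for recursive cloud copies and database dump commands."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    if tokens[0] in DUMP_TOOLS:
        return True
    if tokens[:2] == ["aws", "s3"]:
        verb = tokens[2:3]
        return verb == ["sync"] or (verb == ["cp"] and "--recursive" in tokens)
    if tokens[0] == "gsutil":
        return "cp" in tokens and "-r" in tokens
    if tokens[:2] == ["gcloud", "storage"]:
        return tokens[2:3] == ["cp"] and "-r" in tokens
    return False
```

Single-object copies such as `aws s3 cp s3://bucket/file .` are deliberately not flagged, since they are common in normal work.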
Off-project file reads
The agent reads user files outside the project (`~/Documents`, `~/Downloads`, `~/Desktop`, iCloud/OneDrive folders) that contain sensitive material.
Prior session history
The agent reads Claude Code's session transcripts at `~/.claude/projects/<project-hash>/*.jsonl` for sensitive content the user pasted earlier (tokens, internal URLs, PII).
Environment Variables
The agent reads environment variables that contain secrets: API keys, tokens, database credentials, cloud credentials. These are often exported in shell rc files (`.bashrc`, `.zshrc`, `.profile`, `.envrc`) and can be read directly via `env`, `printenv`, or `echo $VAR`.
Command-and-Control
Establishing a channel for the agent to receive new instructions or small data transfers from outside the system.
4 techniques
CRON-based beaconing
The agent schedules a recurring task via `/loop` or `CronCreate` that fetches attacker-controlled content for instructions. Distinct from Persistence: here the cron is the delivery channel for live instructions.
Claude Code Channels
Claude Code's Channels feature (v2.1.80+) lets external systems push events into a running session via MCP plugins (Telegram, Discord, iMessage, custom). An attacker who can send messages to a paired channel (by compromising the developer's account, getting on the allowlist, or controlling a custom server) can push commands directly into the live session.
Hook-based beaconing
A planted hook (e.g., `UserPromptSubmit`, `PreToolUse`, `PostToolUse`) makes an outbound request on every fire to an attacker server, fetches a response, and injects it into agent context. Every user prompt becomes a polling beacon. The hook can exfiltrate prompt content and tool inputs in the same request.
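Hooks that talk to the network can be surfaced by scanning the hook configuration. The sketch below makes few assumptions about the schema: it only assumes that hook definitions live under a top-level `hooks` key and that each shell command is stored under a key named `command`, both of which should be verified against the real settings format. The `networked_hooks` helper and the marker list are hypothetical.

```python
import re

# Crude markers of outbound network use inside a hook command.
NETWORK_RE = re.compile(r"\b(?:curl|wget|nc|ncat)\b|https?://")


def iter_commands(node):
    """Yield every string stored under a 'command' key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "command" and isinstance(value, str):
                yield value
            else:
                yield from iter_commands(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_commands(item)


def networked_hooks(settings: dict) -> list[str]:
    """Return hook commands that appear to make outbound requests."""
    return [cmd for cmd in iter_commands(settings.get("hooks", {})) if NETWORK_RE.search(cmd)]
```

The same scan applies to hook-based exfiltration, since both techniques rely on a hook reaching an external server.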
Web fetch for instructions
The agent uses `web_fetch`, `curl`, `wget`, or similar to pull content from an attacker URL, re-injecting it as task input.
Exfiltration
Getting collected data out. Exfiltration uses the same network channels as C2 in reverse, so egress controls defend both.
4 techniques
Web fetch / curl / wget with data
The agent sends data to an attacker URL via `web_fetch` or shells out to `curl`/`wget` with the payload in query parameters or POST body.
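An egress allowlist is one common control for this channel. The following sketch assumes a hypothetical `egress_allowed` helper and an illustrative host list; in practice the check would sit in a proxy or pre-request hook rather than in application code.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts a project actually needs.
ALLOWED_HOSTS = {"github.com", "registry.npmjs.org", "pypi.org"}


def egress_allowed(url: str) -> bool:
    """Allow a URL only if its host is an allowlisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == allowed or host.endswith("." + allowed) for allowed in ALLOWED_HOSTS)
```

Matching on the parsed hostname, rather than on a substring of the URL, avoids being fooled by lookalike hosts such as `github.com.attacker.invalid`. An allowlist does not stop data being sent to an allowed host, so it narrows the channel without closing it.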
Hook-based exfiltration
A planted hook reads agent context (prompts, tool inputs, tool results) on lifecycle events and sends it to an attacker server.
Malicious MCP server
The agent configures an MCP server pointing at attacker infrastructure. Tool arguments and returns become the exfiltration channel.
Connected-MCP messaging exfiltration
Legitimate MCP connectors that send messages (Gmail, Slack, Discord, SMS gateways) become exfiltration channels: the agent uses `mcp__gmail__send_email` or `mcp__slack__post_message` to send data to an attacker inbox or phone number.
Impact
Destruction, manipulation, or disruption of integrity and availability. Coding agents sit on developer laptops with commit rights, production-adjacent credentials, and write-capable MCP tools, so a single compromise can damage the codebase, cloud account, and professional reputation simultaneously.
2 techniques
Destructive shell commands
The agent issues categorically destructive commands (`rm -rf`, `rm -r`, `dd if=…`, `mkfs`, `mkswap`, `shred`, `find … -delete`, `find … -exec rm`).
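A pre-execution check for these commands might look like the sketch below. The `is_destructive` helper is hypothetical and inspects only the first command on a line; pipelines, `sudo` prefixes, and shell aliases would need extra handling.

```python
import shlex

DESTRUCTIVE_BINARIES = {"dd", "mkfs", "mkswap", "shred"}


def is_destructive(command: str) -> bool:
    """Return True if the command matches one of the destructive shapes listed above."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    binary = tokens[0].rsplit("/", 1)[-1]  # strip any leading directory
    if binary in DESTRUCTIVE_BINARIES or binary.startswith("mkfs."):
        return True
    if binary == "rm":
        # Collect short flags such as -rf and look for a recursive flag.
        short_flags = "".join(t[1:] for t in tokens[1:] if t.startswith("-") and not t.startswith("--"))
        return "r" in short_flags or "R" in short_flags or "--recursive" in tokens
    if binary == "find":
        if "-delete" in tokens:
            return True
        return any(
            tokens[i] == "-exec" and tokens[i + 1].rsplit("/", 1)[-1] == "rm"
            for i in range(len(tokens) - 1)
        )
    return False
```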
Destructive git operations
The agent runs `git push --force`, `git reset --hard`, `git clean -fdx`, deletes branches, or rewrites history on shared branches, wiping unpushed work and poisoning teammates' clones.
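The git operations above can be recognized by subcommand and flag. This is a minimal sketch with a hypothetical `is_destructive_git` helper; it does not cover history rewrites through `rebase` or `filter-branch`, nor global options placed before the subcommand.

```python
import shlex


def is_destructive_git(command: str) -> bool:
    """Return True for force pushes, hard resets, forced cleans, and branch deletions."""
    tokens = shlex.split(command)
    if len(tokens) < 2 or tokens[0] != "git":
        return False
    subcommand, args = tokens[1], tokens[2:]
    if subcommand == "push":
        return "--force" in args or "-f" in args
    if subcommand == "reset":
        return "--hard" in args
    if subcommand == "clean":
        # Matches -f on its own or combined, as in -fdx.
        return any(a.startswith("-") and not a.startswith("--") and "f" in a for a in args)
    if subcommand == "branch":
        return any(a in ("-d", "-D", "--delete") for a in args)
    return False
```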
v1.0: Initial release (2026-05-12)
- Complete MITRE ATT&CK-style framework for coding agents
- 12 tactics and 45 techniques with detection and remediation strategies for Claude Code