Why This List Matters
AI agents are no longer a toy. Teams are wiring them to real databases, email systems, APIs, and code execution environments — and most deployments treat security as an afterthought. These aren't hypothetical risks lifted from a threat model whitepaper. They're patterns that show up in production the moment an agent gets write access to something it shouldn't have. If you're building or running agents on platforms like n8n, Activepieces, Relevance AI, or Voiceflow, this list is for you.
1. Prompt Injection — The SQL Injection of AI Agents
Prompt injection is the most actively exploited attack vector in deployed AI agents right now. The attack works like this: malicious instructions are embedded in external content that the agent processes — a webpage it scrapes, an email it reads, a support ticket it summarizes, a document it parses. That content rewrites the agent's instructions at runtime, redirecting its behavior without any code change or authentication bypass.
Unlike SQL injection, there's no parameterized query equivalent. You can't simply escape the input and call it handled. The defense requires a layered approach: treat all external content as untrusted data, not as instructions. Validate and sanitize inputs before they enter the agent's context window. Add output filtering to catch unexpected patterns — an agent summarizing support tickets shouldn't be generating API calls or writing to databases. Consider architectural separation between the agent's instruction context and the data it processes, so external content can never override system-level directives.
This is the risk that makes "give the agent read access only where possible" a security requirement, not a preference. A read-only agent that gets prompt-injected is annoying. A write-capable agent that gets prompt-injected is a breach.
2. Over-Privileged Tool Access — Agents That Can Do More Than They Should
The principle of least privilege applies to AI agents exactly the same way it applies to IAM roles, service accounts, and database users. If your agent has write access to a production database but only needs to read from it, that's not a configuration detail — that's a misconfiguration waiting to be exploited.
Define tool permissions at the narrowest scope possible before you deploy. A summarization agent doesn't need DELETE on anything. An email drafting agent doesn't need access to your billing API. A customer support agent doesn't need to be able to push to your Git repo. Most agent platforms let you define tool lists per-agent — use that feature as a security boundary, not just an organizational one.
The practical test: before giving an agent a tool, ask whether removing it would break the agent's core job. If the answer is no, don't include it. Treat tool access like firewall rules — default deny, explicit allow.
3. Secrets Leaking Into Agent Context
API keys, database connection strings, tokens, and passwords that get passed into agent prompts — or worse, logged in tool call histories — are a quiet but serious problem. I've seen n8n workflows where the database connection string was hardcoded directly into the system prompt, fully visible in execution logs and in the workflow export JSON. Anyone with read access to that n8n instance had the database credentials.
Secrets belong in environment variables, loaded at runtime, never in prompt text. Use a proper secrets manager — HashiCorp Vault, Doppler, or even a well-managed .env file — and reference secrets by variable name in your agent configurations. Make sure your logging pipeline is configured to redact secrets before writing to any log store. Most agent platforms have credential vaults built in — use them instead of pasting keys into prompt fields.
The rule: if a secret appears in a context window, an agent response, or a log file, it's already compromised. Rotate it immediately and fix the source.
4. Uncontrolled External API Calls — Data Walking Out the Door
Agents with HTTP tool access can make outbound requests to arbitrary endpoints. A compromised or prompt-injected agent can silently POST your internal data to an attacker-controlled server. Because the request comes from your infrastructure, it may bypass monitoring that watches for external intrusion — the data leak looks like normal outbound traffic.
Enforce egress filtering at the network level, not just at the application layer. On a self-hosted setup — an n8n instance on a Hetzner or Vultr VPS, for example — an iptables allowlist can restrict outbound connections to specific domains and IPs. Your agent orchestration layer shouldn't be able to reach arbitrary internet endpoints. Define the whitelist of services your agents legitimately call (your CRM, your email API, your database) and block everything else at the firewall.
This is one of the clearest arguments for self-hosting your agent infrastructure rather than running it in a shared cloud environment where you have no control over egress rules.
5. No Audit Trail for Agent Actions
When something goes wrong — and it will — you need to know exactly what the agent did, in what order, with what inputs and outputs, triggered by what event. Most teams deploying AI agents in 2026 have zero structured logging on agent actions. They might have LLM API logs, but not a complete action trace that links a user request to every tool call the agent made downstream.
Every tool call an agent makes should be logged with: a timestamp, the agent ID, the triggering event or user, the tool name, the input parameters, the output received, and the execution duration. This is your incident response capability. Without it, you're flying blind when an agent does something unexpected — and the window for diagnosing what happened closes fast as logs roll over.
If you're self-hosting on n8n, the execution history gives you a reasonable starting point, but pipe it to a centralized log store (Loki, CloudWatch, Datadog) so you have retention and search beyond the platform's default window. For cloud platforms like Relevance AI or Voiceflow, export audit logs via their API on a schedule and store them somewhere you control.
6. Runaway Loops and Resource Exhaustion
An agent that gets confused about its goal — or that enters a retry loop because a tool keeps returning errors — can run indefinitely, calling tools, generating tokens, and consuming API credits or VPS CPU until it crashes your instance or your bill hits four figures. I've had n8n workflows loop for 20 minutes burning through OpenAI tokens before I caught it. It's not catastrophic, but it's expensive and embarrassing.
Implement hard limits at every layer: maximum tool calls per execution run, maximum tokens per step, timeout thresholds on the overall workflow. Most automation platforms support execution timeouts — set them to something sane (5 minutes is generous for most tasks). Set budget alerts on your LLM API keys so a runaway agent triggers an alarm before it does serious financial damage. For self-hosted models, set CPU and memory limits on the container so a runaway process can't starve the rest of your stack.
Don't assume the LLM will self-terminate gracefully when it gets stuck. It won't. Build the guardrails into the orchestration layer.
7. Insecure Code Execution — Sandbox Escapes in the Wild
If your agent can execute code — via a Python tool node, a Bash executor, a JavaScript sandbox, or a built-in code execution step — that's a potential sandbox escape. A prompt-injected or maliciously crafted input can direct the agent to run arbitrary commands on the host system, read files it shouldn't access, or establish outbound connections that your network policy didn't anticipate.
Code execution nodes should always run in isolated containers with a strict security profile: no access to the host filesystem outside a defined temp directory, no outbound network access, CPU and memory hard limits, and a short execution timeout. Don't run code execution on the same VPS as your production database or your main application — keep it on a separate, disposable instance that you can kill and rebuild without touching anything critical.
The threat model here is the same as eval() in web security: if you give an agent the ability to run arbitrary code, treat it with the same paranoia you'd apply to user-submitted JavaScript in a browser.
8. Third-Party Integration Supply Chain Risk
Agents connect to dozens of external services through pre-built integrations — CRMs, email platforms, databases, file storage, payment systems. Each integration is a dependency, and each dependency is a trust relationship. If an upstream integration is compromised — a malicious update to a connector package, a misconfigured OAuth token, a breached third-party vendor — your agent inherits that compromise automatically.
Audit the integrations your agents use with the same rigor you'd apply to npm dependencies. Rotate credentials on a regular schedule, not just when a breach is reported. Scope OAuth tokens to the minimum required permissions — a cold email agent doesn't need full write access to your CRM, it needs to create contacts and log email activity, nothing else. Review integration permission scopes when you first connect them and again when the vendor releases an update that requires re-authorization.
For self-hosted automation platforms, keep the platform and its integration packages updated. Unpatched integration packages in a self-hosted n8n instance are a real attack surface — especially for integrations to high-value targets like payment processors or identity providers.
9. No Human-in-the-Loop for Irreversible Actions
The most dangerous thing an AI agent can do is take an action that cannot be undone: deleting database records, sending emails to thousands of contacts, pushing code to production, transferring money, revoking user access. Fully autonomous agents operating on irreversible actions without a review step are the single biggest risk category for real organizational damage — not from attackers, but from the agent doing exactly what it was told in a way nobody intended.
Design your agent workflows so that irreversible actions require explicit human confirmation. This doesn't mean abandoning automation — it means adding a gate before the point of no return. Common patterns: a Slack approval step that must be acknowledged before the action fires; a review queue where a human can inspect the planned action and approve or reject; a webhook confirmation that requires a signed response within a timeout window. If the confirmation isn't received, the action doesn't execute.
The framing: automate the repetitive, gate the irreversible. Every agent workflow should have a clear map of which actions are reversible (you can undo a draft) and which are not (you cannot unsend 10,000 emails). Apply confirmation gates to the irreversible ones, full stop. No exceptions for "this one is usually fine."
Key Takeaway
AI agents inherit the entire attack surface of every system they connect to, plus their own unique risks on top. A compromised agent isn't just a chatbot giving bad answers — it's an authenticated actor with API access to your infrastructure, acting autonomously, at speed, without a human watching every step. Treat agent security the same way you'd treat any privileged service account: least privilege, audit trails, hard limits, and defense in depth. The tools to build secure agent deployments exist today. The question is whether you apply them before an incident, or after.