

If you’ve been playing with LLMs lately, you’ve probably noticed how easy it is to wire them into the rest of your stack. A few lines of config here, a new server there, and suddenly your model can talk to your cloud provider, your ticketing system, or your internal tools.
The problem is that it’s also never been easier to accidentally expose something you really care about.
We’re already seeing how quickly this can go wrong with things like OpenClaw, where people wired agents into sensitive systems and only realized the risks once it was too late. Sometimes after losing important data like entire email inboxes.
AI is a power‑up in both directions. It can make your workflows way better, or it can make your mistakes way worse. A tool you thought was “just for testing” can suddenly be called at scale by an enthusiastic agent that doesn’t understand the difference between safe and dangerous.
In this post, I want to unpack what MCP servers actually are, why they deserve a proper security conversation, and some practical steps to make them less scary to run in the real world.
What is MCP, and why does security matter?
TL;DR: MCP servers act as gateways that let LLMs call tools and systems, so weak permissions or poor controls can expose critical infrastructure.
An MCP server is a gateway that exposes tools and data sources to LLMs in a controlled way, turning natural‑language requests into concrete actions.
Instead of hard‑coding every API call in your application, you define tools that live behind an MCP server.
The model or agent doesn’t talk directly to your cloud provider or database; it talks to the MCP server, which decides which tool to call, passes parameters, and returns the result back to the model.
In practice, this means the MCP server sits right between “ask the model to do something” and “something actually happens in your systems.” That’s the whole point: let the model orchestrate behavior by calling tools, instead of just generating text.
From a security point of view, every one of those tools is a door into something you might regret exposing. If the MCP server has weak auth, overly broad permissions, or little visibility, then prompt injection, buggy prompts, or just poorly designed agents can push those tools much harder than you expected.
Once an MCP server can talk to real systems, you’re not just doing prompt engineering anymore. You’re managing a serious piece of your security surface.
At a practical level, an MCP server is just the middleman between your model and the outside world.
How MCP servers work
TL;DR: MCP servers sit between the model and external systems, turning tool requests into real API calls, database queries, or cloud actions.
At a practical level, an MCP server is just the middleman between your model and the outside world. The model sees a list of capabilities, picks one based on the conversation, and sends a structured request back to the MCP server.
From there, the server turns that request into a real call: an HTTP request to an internal API, a query to a database, an operation against your cloud provider, or whatever you wired it to do. That loop can happen once, or it can chain several tools together based on how the agent interprets the task.
The reason I compare MCP servers to high‑value API gateways is that the security story is basically the same, just with more autonomy on top.
If you give that agent access to powerful tools behind an MCP server, you have to assume it will eventually try every path you allow, not just the ones you imagined.
The biggest MCP server risks
TL;DR: The biggest risks include overly permissive tools, data exfiltration, and agents accidentally causing large-scale damage to connected systems.
Once you take a closer look at MCP, a few obvious risk categories pop out.
1. Overly permissive tools
Giving reasonably broad permissions to an agent can seem like a good idea at first. But then you forget, the configuration spreads, and you’ve effectively handed a very eager agent the ability to make big changes with a single bad prompt.
2. Data exfiltration
Another risk is data exfiltration. If a tool exposes sensitive systems without strict scoping and redaction, it becomes much easier for a malicious or compromised prompt to convince the agent to leak more than you intended. You’re no longer asking “can this endpoint be called,” but “how much can this agent see and share once it’s inside.”
3. Far-reaching consequences
The cloud angle deserves its own call‑out. The pattern looks a lot like the OpenClaw stories that have been floating around: people wire agents into broad and important systems, assume “it’ll be fine,” and only realize the real blast radius after something painful happens, like losing an entire database or touching the wrong environment.
Each tool should do one narrow job with the minimum permissions needed, and it should never have more access than you’d be comfortable giving a junior engineer in their first week.
Designing safer MCP tools and permissions
TL;DR: Secure MCP tools with least-privilege permissions, safe defaults, and human approval for high-risk actions.
The good news is you don’t need brand-new techniques to make MCP safer; you just need to apply basic security discipline to a new surface.
Least privilege should be the default for every MCP tool. Each tool should do one narrow job with the minimum permissions needed, and it should never have more access than you’d be comfortable giving a junior engineer in their first week.
Safe defaults help a lot. New tools should start out read‑only wherever possible, and anything that changes state should live behind extra friction.
That might mean explicit approval steps, tighter rate limits, or a pattern where the agent proposes a plan and a human confirms before it runs.
You also want clear boundaries around the MCP server itself. Only trusted clients should be able to talk to it, the network exposure should be intentional, and the configuration should be simple enough that you can explain which tools exist and what they can touch on a whiteboard.
If you can’t answer that last question, you’re already guessing about your risk.
Monitoring and testing the MCP server security
TL;DR: Logging, alerts, and regular testing help detect misuse, prompt injection attempts, and unexpected tool behavior early.
Even with good design, you won’t know how safe your MCP setup really is until you watch it in the wild.
At a minimum, you want to log every MCP interaction: which client called the server, which tool ran, which parameters it received, and whether the call succeeded. That’s your black-box recorder when something weird happens.
On top of logging, add simple alerts for patterns that should never be ignored. Sudden spikes in calls to sensitive tools, repeated failures from unknown clients, or requests coming from unexpected locations are all good early‑warning signals.
You don’t need a full-blown SOC on day one, but you do need enough visibility to spot “this looks wrong” before it becomes a postmortem.
Testing also needs to evolve. It’s not enough to check that tools work in the happy path; you should have a safe environment where you deliberately try prompt injection, misuse tools, and see how the system behaves.
Treat MCP flows like any other critical integration: write tests, run them regularly, and make sure changes to prompts or tools don’t quietly widen your risk.
You don’t need a full-blown SOC on day one, but you do need enough visibility to spot “this looks wrong” before it becomes a postmortem.
Conclusion
TL;DR: MCP servers are powerful infrastructure components, so treat them like high-value APIs with strong security controls and visibility.
MCP servers look like a developer convenience feature, but in practice, they’re part of your core infrastructure.
They sit between your models and the systems that actually do work, and once you connect real tools behind them, you’ve effectively created a new control plane. If that control plane is misconfigured, it doesn’t really matter how good the rest of your security story is.
“AI agents don’t fit neatly into existing security models for humans and machines. They are hyper-scale, dynamic, and short-lived entities, yet they often hold powerful access to critical systems. If their privileges aren’t carefully monitored and appropriately constrained, organizations can be left exposed to escalation attacks, data breaches, and other security incidents.”
– CyberArk, What’s Shaping the AI Agent Security Market in 2026 (cyberark.com)
The pattern we’ve seen with things like OpenClaw and other autonomous setups is always the same: powerful integrations, optimistic assumptions, and very little visibility until something goes wrong.
For me, a reasonable rule of thumb is simple: if you wouldn’t expose an API directly without strong auth, scoping, and logging, don’t expose it indirectly through MCP either. Design tools with narrow permissions, and keep high‑risk actions behind human approval.
There’s an old security line that says “every new feature is a new attack surface.” MCP servers are exactly that: a new surface where powerful integrations meet autonomous behavior. If you treat them like toys, they’ll eventually bite you when it matters.
Want to improve your general security posture? See how Tricentis products can help your team.
This post was written by David Snatch. David is a cloud architect focused on implementing secure continuous delivery pipelines using Terraform, Kubernetes, and any other awesome tech that helps customers deliver results.
