Model Context Protocol in Production: Server Boundaries, Tool Governance, and Transport Choices
Author: Youngju Kim (@fjvbn20031)

Introduction
Model Context Protocol, or MCP, is quickly becoming a practical standard for connecting AI applications to tools and contextual data. The real production question is not whether MCP is elegant. It is whether the way you package servers, tools, resources, and transports will remain safe, understandable, and cheap once real users and real systems are involved.
The hard questions appear early.
- How much should one MCP server own?
- When should something be a tool versus a resource?
- When is stdio enough, and when do you need HTTP?
- How do you stop tool descriptions and oversized payloads from wasting context?
- How do you audit who invoked what and why?
This guide stays close to the official MCP documentation and focuses on operating concerns rather than toy examples.
Decide server boundaries first
One of the most common mistakes is building a single giant MCP server that exposes every internal API. That usually creates four problems at once.
- Tool descriptions become long and repetitive
- Security boundaries blur
- Failures become harder to isolate
- Versioning becomes coupled across unrelated domains
A better model is to split servers along domain or trust boundaries.
| Example boundary | What it owns | Why it should be isolated |
|---|---|---|
| github-governance | PR lookup, ruleset status, required checks | Permissions and workflows are specific to source control governance |
| incident-ops | alert summaries, runbook links, incident notes | Audit and reliability needs differ from developer tooling |
| docs-search | read-only documentation resources | Search-heavy, read-only usage patterns benefit from simpler controls |
The best MCP platforms feel smaller than the systems behind them because they expose only the right surface.
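A boundary split like the table above can be kept honest with a small, reviewable inventory. The server names, tool names, and trust-boundary labels below are the hypothetical ones from the table, not a real deployment:

```python
# Hypothetical server inventory mirroring the boundary table above.
# Each server owns one domain and one trust boundary; names are illustrative.
SERVERS = {
    "github-governance": {
        "trust_boundary": "source-control",
        "tools": ["list_pull_requests", "get_ruleset_status"],
        "mutating": False,
    },
    "incident-ops": {
        "trust_boundary": "incident-response",
        "tools": ["create_incident_note", "get_alert_summary"],
        "mutating": True,
    },
    "docs-search": {
        "trust_boundary": "read-only-docs",
        "tools": ["search_docs"],
        "mutating": False,
    },
}

def tools_crossing_boundary(inventory: dict) -> list[str]:
    """Flag servers that mix mutating tools into a read-only trust boundary."""
    return [
        name for name, server in inventory.items()
        if server["mutating"] and server["trust_boundary"].startswith("read-only")
    ]
```

A check like this makes boundary drift visible in review: if a read-only server grows a mutating tool, the inventory flags it before deployment does.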
Tools and resources should not be interchangeable
The protocol distinguishes tools from resources for a reason.
- Tool: best for actions, transformations, or structured lookups that feel like function calls
- Resource: best for readable context such as documentation, inventories, or runbooks
If everything becomes a tool, the model over-calls interfaces just to read context. If everything becomes a resource, you lose clear action boundaries and make approval and auditing harder.
In practice:
- use tools for deploy, create, update, validate, or targeted lookup flows
- use resources for policies, docs, service maps, and runbooks
That separation improves both reliability and governance.
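The separation can be made mechanical by keeping tools and resources in distinct registries. This is a stdlib-only stand-in to illustrate the shape, not the official MCP SDK API, and the tool and resource names are hypothetical:

```python
# Minimal stand-in for a server surface that keeps tools (actions) and
# resources (readable context) in separate registries, so they never mix.
class Surface:
    def __init__(self):
        self.tools = {}       # callable actions: deploy, create, lookup
        self.resources = {}   # readable context: docs, runbooks, policies

    def tool(self, name):
        def register(fn):
            self.tools[name] = fn
            return fn
        return register

    def resource(self, uri):
        def register(fn):
            self.resources[uri] = fn
            return fn
        return register

surface = Surface()

@surface.tool("create_incident_note")
def create_incident_note(incident_id: str, body: str) -> dict:
    # An action: state changes as a result of this call.
    return {"incident_id": incident_id, "status": "created"}

@surface.resource("runbooks://payments/rollback")
def rollback_runbook() -> str:
    # Context: nothing changes when this is read.
    return "1. Freeze deploys\n2. Roll back to last green build"
```

Keeping the two registries separate means approval and audit policy can attach to `surface.tools` alone, while resources stay cheap to read.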
Transport choice: stdio versus Streamable HTTP
Transport is not a minor implementation detail. It changes how you operate the server.
When stdio is a strong fit
- local desktop clients
- developer tooling
- single-user flows
- environments where the client can own the server process lifecycle
Strengths:
- simple to implement
- ideal for local integrations
- fast to iterate on during development
Limits:
- weak central observability
- awkward for multi-tenant operation
- lifecycle depends on the client process model
When Streamable HTTP is a better fit
- shared services used by multiple clients
- centrally managed authentication and policy
- platform teams that need repeatable deployment, logging, and network controls
Strengths:
- fits existing service operations patterns
- works cleanly with gateways, ingress, and service mesh policies
- easier to scale across client types
Limits:
- requires more explicit auth and trust-boundary design
- can become a thin wrapper over internal APIs if interface design is not intentional
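The trade-offs above condense into a simple decision rule. The inputs here are illustrative flags, not a complete checklist:

```python
def pick_transport(multi_client: bool, central_auth: bool,
                   client_owns_lifecycle: bool) -> str:
    """Rough rule of thumb distilled from the trade-offs above."""
    if multi_client or central_auth:
        # Shared service: gateways, central logging, network policy.
        return "streamable-http"
    if client_owns_lifecycle:
        # Local, single-user, client-managed process.
        return "stdio"
    # Default to the operationally richer option.
    return "streamable-http"
```

For example, a local desktop integration where the client spawns the server maps to stdio, while anything multi-tenant or behind central auth maps to Streamable HTTP.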
Design tools for model clarity, not for internal convenience
Choose narrow, explicit names
Poor tool names make the model guess.
- run_action
- manage_item
- operate_system
Better names make the outcome predictable.
- list_pull_requests
- get_ruleset_status
- create_incident_note
Tool names are part of the prompt surface. Ambiguity increases selection errors.
Keep schemas strict and short
Every additional free-form field creates another chance for invalid calls. Practical rules:
- minimize required inputs
- prefer enums over unconstrained text where possible
- keep descriptions short and action-oriented
- avoid mixing human-facing labels with internal identifiers
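One way to enforce those rules is to express the input contract as a minimal required set plus enums, and reject anything outside it. The tool, fields, and enum values below are hypothetical:

```python
# Hypothetical input contract for a get_ruleset_status tool:
# two required fields, one constrained to an enum, nothing free-form.
SCHEMA = {
    "required": ["repository", "scope"],
    "enums": {"scope": {"branch", "tag", "push"}},
}

def validate(payload: dict, schema: dict = SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the call is acceptable."""
    problems = [
        f"missing required field: {field}"
        for field in schema["required"] if field not in payload
    ]
    for field, allowed in schema["enums"].items():
        if field in payload and payload[field] not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    return problems
```

Returning the full problem list, rather than failing on the first error, gives the model one corrective round trip instead of several.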
Return the next useful shape, not a raw dump
Models do better with structured results that support the next decision. A result like this is more useful than a giant unfiltered JSON payload.
```json
{
  "repository": "platform/api",
  "ruleset_status": "blocking",
  "missing_checks": ["lint", "integration-test"],
  "next_actions": [
    "wait_for_required_checks",
    "request_codeowner_review"
  ]
}
```
In production systems, returning the smallest useful summary usually reduces repeated calls and wasted context.
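Shaping a raw payload into that kind of summary can be a thin, testable function on the server side. The raw field names below are invented for illustration, not taken from any real API:

```python
def summarize_ruleset(raw: dict) -> dict:
    """Reduce a raw (hypothetical) API payload to the smallest useful summary."""
    missing = [c["name"] for c in raw.get("checks", []) if c.get("state") != "passed"]
    return {
        "repository": raw["repo_full_name"],
        "ruleset_status": "blocking" if missing else "clear",
        "missing_checks": missing,
        "next_actions": ["wait_for_required_checks"] if missing else [],
    }

# A hypothetical raw payload; the model only ever sees the summary.
raw = {
    "repo_full_name": "platform/api",
    "checks": [
        {"name": "lint", "state": "failed"},
        {"name": "integration-test", "state": "queued"},
        {"name": "unit-test", "state": "passed"},
    ],
    # ...dozens of other fields the model never needs...
}
```

The summary names the next decision explicitly, so the model does not have to re-derive it from check states on every turn.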
Token budgeting matters more than most teams expect
One of the most practical lessons from recent MCP tooling discussions is that context waste often matters more than raw model quality.
Common waste patterns:
- duplicated tool descriptions
- entire documents returned when only a summary is needed
- large low-value fields included in every response
- one server exposing too many tools
Operational rules that help:
- Keep tool descriptions short.
- Split large resources into retrievable chunks.
- Use summary-first responses and detail follow-up calls.
- Remove low-use tools based on real invocation data.
If your runbook server returns the full document body for every request, the system will feel slow and expensive even when the protocol is working correctly.
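A crude budget check catches the worst offenders before release. The four-characters-per-token ratio below is a rough heuristic for English prose, not an exact tokenizer:

```python
def estimated_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English prose.
    return len(text) // 4

def chunk_resource(body: str, max_tokens: int = 500) -> list[str]:
    """Split a large resource into retrievable chunks under a token budget."""
    max_chars = max_tokens * 4
    return [body[i:i + max_chars] for i in range(0, len(body), max_chars)]
```

A server can run `estimated_tokens` over every response in CI and fail the build when a tool description or resource blows past its budget, long before a user notices the slowness.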
Authentication and permission boundaries
The moment MCP touches internal systems, convenience stops being the main concern. Authorization does.
Questions you should answer before rollout:
- Is the caller acting as a user delegate or as a service identity?
- Are read-only and mutating tools isolated?
- Should sensitive resources live on a separate server?
- What minimum audit record must exist for every invocation?
Recommended operating pattern:
- split read-only and mutating servers
- require idempotency or approval for write-capable tools
- log user, session, tool name, input summary, and result status
- keep failures readable enough for human review
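The minimum audit record above can be one structured log line per invocation. The field names and truncation limit here are illustrative choices, not a standard:

```python
import json
import time

def audit_record(user: str, session: str, tool: str,
                 inputs: dict, status: str) -> str:
    """One structured line per tool call: who, what, with which inputs, and the outcome."""
    return json.dumps({
        "ts": int(time.time()),
        "user": user,
        "session": session,
        "tool": tool,
        # Summarize inputs rather than logging full payloads (size and secrets).
        "input_summary": {k: str(v)[:80] for k, v in inputs.items()},
        "status": status,  # e.g. "ok", "denied", "error"
    })
```

Truncating input values keeps the log human-reviewable and avoids persisting full payloads, which may carry sensitive data.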
Observability and operational review
Treat MCP servers like production services, not glue code.
| Area | Metrics to watch | Why it matters |
|---|---|---|
| Availability | success rate, latency | Helps you see when clients stop trusting the server |
| Quality | retry rate after tool calls | Surfaces confusing tools or weak schemas |
| Cost | response size, estimated token load | Highlights context waste |
| Security | denied calls, permission failures, unusual usage patterns | Detects abuse and policy gaps |
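The retry-rate signal in the table can be computed directly from invocation logs. The log shape below matches the hypothetical audit record fields, and the "retry" definition (same session, same tool, after a non-ok call) is one reasonable choice among several:

```python
def retry_rate(invocations: list[dict]) -> float:
    """Fraction of calls that are immediate retries of a failed call.

    A high rate usually means a confusing tool or a weak schema."""
    retries = 0
    for prev, cur in zip(invocations, invocations[1:]):
        if (prev["session"] == cur["session"]
                and prev["tool"] == cur["tool"]
                and prev["status"] != "ok"):
            retries += 1
    return retries / max(len(invocations), 1)
```

Tracked per tool, this metric points review time at the interfaces the model actually struggles with, instead of the ones the team suspects.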
Pre-production checklist
- Are server boundaries aligned with domain and trust boundaries?
- Are tool names and schemas narrow and explicit?
- Do resources avoid oversized payloads?
- Are audit logs sufficient?
- Is fallback behavior defined for server failure?
- Is the transport decision documented?
Anti-patterns
"We can just wrap every internal API with MCP"
That usually produces a protocol-shaped proxy, not a model-usable interface. MCP works best when the surface is intentionally designed for model reasoning.
"More tools means a smarter agent"
Usually the opposite happens. More tools create more selection ambiguity and more context overhead. Smaller, clearer interfaces outperform large noisy ones.
"If it works locally, it is ready for production"
A local stdio integration can feel great and still fail the moment central auth, multi-client access, deployment, and observability requirements arrive. Production MCP work is interface design plus platform operations.
Closing thoughts
Teams that deploy MCP well usually do not start with many tools. They start with clear boundaries, narrow schemas, and a deliberate transport model. The protocol is helpful, but the lasting success comes from governance.
If you want a durable starting point, prioritize these two decisions first.
- split servers by domain and trust boundary
- design short, explicit tools and resources
Most of the later wins in security, observability, and cost control get easier once those two are right.