Model Context Protocol in Production: Server Boundaries, Tool Governance, and Transport Choices

Introduction

Model Context Protocol, or MCP, is quickly becoming a practical standard for connecting AI applications to tools and contextual data. The real production question is not whether MCP is elegant. It is whether the way you package servers, tools, resources, and transports will remain safe, understandable, and cheap once real users and real systems are involved.

The hard questions appear early.

  • How much should one MCP server own?
  • When should something be a tool versus a resource?
  • When is stdio enough, and when do you need HTTP?
  • How do you stop tool descriptions and oversized payloads from wasting context?
  • How do you audit who invoked what, and why?

This guide stays close to the official MCP documentation and focuses on operating concerns rather than toy examples.

Decide server boundaries first

One of the most common mistakes is building a single giant MCP server that exposes every internal API. That usually creates four problems at once.

  • Tool descriptions become long and repetitive
  • Security boundaries blur
  • Failures become harder to isolate
  • Versioning becomes coupled across unrelated domains

A better model is to split servers along domain or trust boundaries.

| Example boundary | What it owns | Why it should be isolated |
| --- | --- | --- |
| github-governance | PR lookup, ruleset status, required checks | Permissions and workflows are specific to source control governance |
| incident-ops | Alert summaries, runbook links, incident notes | Audit and reliability needs differ from developer tooling |
| docs-search | Read-only documentation resources | Search-heavy, read-only usage patterns benefit from simpler controls |

The best MCP platforms feel smaller than the systems behind them because they expose only the right surface.
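As a sketch, that boundary split can be expressed as a per-server manifest. The server and tool names below are illustrative, not from any real deployment:

```python
# Hypothetical per-server manifests: each MCP server owns one domain and
# one trust boundary, so its tool surface stays small and auditable.
SERVER_MANIFESTS = {
    "github-governance": {
        "trust_boundary": "source-control",
        "tools": ["list_pull_requests", "get_ruleset_status"],
        "mutating": False,
    },
    "incident-ops": {
        "trust_boundary": "incident-response",
        "tools": ["create_incident_note", "summarize_alerts"],
        "mutating": True,
    },
    "docs-search": {
        "trust_boundary": "read-only-docs",
        "tools": ["search_docs"],
        "mutating": False,
    },
}

def tools_visible_to(server_name: str) -> list[str]:
    """A client connected to one server sees only that server's tools."""
    return SERVER_MANIFESTS[server_name]["tools"]
```

Because no tool appears in two manifests, a compromised or misbehaving client of one server cannot reach another domain's surface.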

Tools and resources should not be interchangeable

The protocol distinguishes tools from resources for a reason.

  • tool
    • best for actions, transformations, or structured lookups that feel like function calls
  • resource
    • best for readable context such as documentation, inventories, or runbooks

If everything becomes a tool, the model over-calls interfaces just to read context. If everything becomes a resource, you lose clear action boundaries and make approval and auditing harder.

In practice:

  • use tools for deploy, create, update, validate, or targeted lookup flows
  • use resources for policies, docs, service maps, and runbooks

That separation improves both reliability and governance.
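A minimal sketch of that split, using a hand-rolled registry rather than any real MCP SDK, so the tool/resource distinction is explicit in code (all names are invented for the example):

```python
from dataclasses import dataclass, field

@dataclass
class Registry:
    """Toy stand-in for an MCP server: tools are callable actions,
    resources are readable context keyed by URI."""
    tools: dict = field(default_factory=dict)
    resources: dict = field(default_factory=dict)

    def tool(self, name: str):
        """Register an action the model invokes like a function call."""
        def wrap(fn):
            self.tools[name] = fn
            return fn
        return wrap

    def resource(self, uri: str):
        """Register readable context (docs, runbooks) fetched by URI."""
        def wrap(fn):
            self.resources[uri] = fn
            return fn
        return wrap

registry = Registry()

@registry.tool("create_incident_note")
def create_incident_note(incident_id: str, body: str) -> dict:
    # An action with a clear audit boundary: it mutates state.
    return {"incident_id": incident_id, "status": "created"}

@registry.resource("runbook://payments/rollback")
def rollback_runbook() -> str:
    # Pure context: reading it should never require an approval flow.
    return "1. Freeze deploys. 2. Roll back to last green build."
```

The payoff is that approval, logging, and rate-limit policy can key off one question: is this call in `tools` or in `resources`?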

Transport choice: stdio versus Streamable HTTP

Transport is not a minor implementation detail. It changes how you operate the server.

When stdio is a strong fit

  • local desktop clients
  • developer tooling
  • single-user flows
  • environments where the client can own the server process lifecycle

Strengths:

  • simple to implement
  • ideal for local integrations
  • fast to iterate on during development

Limits:

  • weak central observability
  • awkward for multi-tenant operation
  • lifecycle depends on the client process model

When Streamable HTTP is a better fit

  • shared services used by multiple clients
  • centrally managed authentication and policy
  • platform teams that need repeatable deployment, logging, and network controls

Strengths:

  • fits existing service operations patterns
  • works cleanly with gateways, ingress, and service mesh policies
  • easier to scale across client types

Limits:

  • requires more explicit auth and trust-boundary design
  • can become a thin wrapper over internal APIs if interface design is not intentional

Design tools for model clarity, not for internal convenience

Choose narrow, explicit names

Poor tool names make the model guess.

```
run_action
manage_item
operate_system
```

Better names make the outcome predictable.

```
list_pull_requests
get_ruleset_status
create_incident_note
```

Tool names are part of the prompt surface. Ambiguity increases selection errors.

Keep schemas strict and short

Every additional free-form field creates another chance for invalid calls. Practical rules:

  • minimize required inputs
  • prefer enums over unconstrained text where possible
  • keep descriptions short and action-oriented
  • avoid mixing human-facing labels with internal identifiers
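As an illustration, a strict schema for a hypothetical `get_ruleset_status` tool might constrain inputs with enums and a minimal validator. The field names and allowed values are invented for the example:

```python
# Tight input schema: two required fields, one enum-constrained,
# and no free-form text for the model to improvise in.
SCHEMA = {
    "required": ["repository", "check_type"],
    "enums": {"check_type": {"lint", "unit-test", "integration-test"}},
}

def validate(args: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; empty means valid."""
    problems = []
    for name in schema["required"]:
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, allowed in schema["enums"].items():
        if name in args and args[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    return problems
```

Returning readable problem strings, rather than a bare rejection, also gives the model something concrete to correct on the retry.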

Return the next useful shape, not a raw dump

Models do better with structured results that support the next decision. A result like this is more useful than a giant unfiltered JSON payload.

```json
{
  "repository": "platform/api",
  "ruleset_status": "blocking",
  "missing_checks": ["lint", "integration-test"],
  "next_actions": [
    "wait_for_required_checks",
    "request_codeowner_review"
  ]
}
```

In production systems, returning the smallest useful summary usually reduces repeated calls and wasted context.
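A sketch of that shaping step, assuming a typical source-control API payload. The raw field names (`repo`, `checks`, `state`) are assumptions about an upstream API, not part of MCP; the output mirrors the summary shape above:

```python
def summarize_ruleset(raw: dict) -> dict:
    """Collapse a large upstream payload into a next-decision summary."""
    missing = [c["name"] for c in raw.get("checks", [])
               if c["state"] != "passed"]
    return {
        "repository": raw["repo"]["full_name"],
        "ruleset_status": "blocking" if missing else "clear",
        "missing_checks": missing,
        # Suggest the follow-up so the model does not re-query to decide.
        "next_actions": ["wait_for_required_checks"] if missing else [],
    }
```

The tool boundary is exactly where this shaping belongs: the server sees the full payload once; the model only ever sees the summary.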

Token budgeting matters more than most teams expect

One of the most practical lessons from recent MCP tooling discussions is that context waste often matters more than raw model quality.

Common waste patterns:

  • duplicated tool descriptions
  • entire documents returned when only a summary is needed
  • large low-value fields included in every response
  • one server exposing too many tools

Operational rules that help:

  1. Keep tool descriptions short.
  2. Split large resources into retrievable chunks.
  3. Use summary-first responses and detail follow-up calls.
  4. Remove low-use tools based on real invocation data.

If your runbook server returns the full document body for every request, the system will feel slow and expensive even when the protocol is working correctly.
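One way to catch description bloat early is a rough token estimate over the tool catalog. The 4-characters-per-token rule is an approximation, not a tokenizer, and the budget is an arbitrary example value:

```python
def estimate_tokens(text: str) -> int:
    """Very rough heuristic: ~4 characters per token for English prose."""
    return max(1, len(text) // 4)

def flag_heavy_tools(descriptions: dict, budget_per_tool: int = 100) -> list:
    """Return tools whose descriptions alone exceed the per-tool budget.

    Every flagged description is paid for on every request that lists
    tools, whether or not the tool is ever called.
    """
    return sorted(
        name for name, desc in descriptions.items()
        if estimate_tokens(desc) > budget_per_tool
    )
```

Run this in CI over the tool catalog and the budget becomes a reviewable artifact instead of a surprise in production traces.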

Authentication and permission boundaries

The moment MCP touches internal systems, convenience stops being the main concern. Authorization does.

Questions you should answer before rollout:

  • Is the caller acting as a user delegate or as a service identity?
  • Are read-only and mutating tools isolated?
  • Should sensitive resources live on a separate server?
  • What minimum audit record must exist for every invocation?

Recommended operating pattern:

  • split read-only and mutating servers
  • require idempotency or approval for write-capable tools
  • log user, session, tool name, input summary, and result status
  • keep failures readable enough for human review
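The minimum audit record can be sketched as one JSON line per invocation. The field set follows the list above; the 200-character truncation is an arbitrary example choice:

```python
import json
import time

def audit_record(user: str, session: str, tool: str,
                 args: dict, status: str) -> str:
    """Emit one JSON line per invocation.

    Inputs are truncated to a summary so the audit log itself does not
    become a context-sized payload or a secret-retention problem.
    """
    record = {
        "ts": int(time.time()),
        "user": user,
        "session": session,
        "tool": tool,
        "input_summary": json.dumps(args, sort_keys=True)[:200],
        "status": status,
    }
    return json.dumps(record, sort_keys=True)
```

Keeping it to one flat line per call is what makes "failures readable enough for human review" achievable with ordinary log tooling.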

Observability and operational review

Treat MCP servers like production services, not glue code.

| Area | Metrics to watch | Why it matters |
| --- | --- | --- |
| Availability | Success rate, latency | Helps you see when clients stop trusting the server |
| Quality | Retry rate after tool calls | Surfaces confusing tools or weak schemas |
| Cost | Response size, estimated token load | Highlights context waste |
| Security | Denied calls, permission failures, unusual usage patterns | Detects abuse and policy gaps |
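These review signals can be computed directly from invocation records. The record shape here, with `tool`, `ok`, and `retry` fields, is an assumption about your logging format, not something MCP defines:

```python
from collections import Counter

def review_metrics(invocations: list) -> dict:
    """Compute review signals from invocation records.

    Each record is assumed to carry 'tool' (str), 'ok' (bool), and
    'retry' (bool, true if the client re-called after this result).
    """
    total = len(invocations)
    ok = sum(1 for i in invocations if i["ok"])
    retries = sum(1 for i in invocations if i["retry"])
    usage = Counter(i["tool"] for i in invocations)
    return {
        "success_rate": ok / total if total else 0.0,
        "retry_rate": retries / total if total else 0.0,
        # Candidates for removal: tools invoked only once in the window.
        "low_use_tools": [t for t, n in usage.items() if n == 1],
    }
```

The `low_use_tools` list feeds the earlier rule about pruning tools from real invocation data rather than intuition.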

Pre-production checklist

  • Are server boundaries aligned with domain and trust boundaries?
  • Are tool names and schemas narrow and explicit?
  • Do resources avoid oversized payloads?
  • Are audit logs sufficient?
  • Is fallback behavior defined for server failure?
  • Is the transport decision documented?

Anti-patterns

"We can just wrap every internal API with MCP"

That usually produces a protocol-shaped proxy, not a model-usable interface. MCP works best when the surface is intentionally designed for model reasoning.

"More tools means a smarter agent"

Usually the opposite happens. More tools create more selection ambiguity and more context overhead. Smaller, clearer interfaces outperform large noisy ones.

"If it works locally, it is ready for production"

A local stdio integration can feel great and still fail the moment central auth, multi-client access, deployment, and observability requirements arrive. Production MCP work is interface design plus platform operations.

Closing thoughts

Teams that deploy MCP well usually do not start with many tools. They start with clear boundaries, narrow schemas, and a deliberate transport model. The protocol is helpful, but lasting success comes from governance.

If you want a durable starting point, prioritize these two decisions first.

  • split servers by domain and trust boundary
  • design short, explicit tools and resources

Most of the later wins in security, observability, and cost control get easier once those two are right.
