Model Context Protocol in Production: Server Boundaries, Tool Governance, and Transport Choices

Introduction

Model Context Protocol, or MCP, is quickly becoming a practical standard for connecting AI applications to tools and contextual data. The real production question is not whether MCP is elegant. It is whether the way you package servers, tools, resources, and transports will remain safe, understandable, and cheap once real users and real systems are involved.

The hard questions appear early.

  • How much should one MCP server own?
  • When should something be a tool versus a resource?
  • When is stdio enough, and when do you need HTTP?
  • How do you stop tool descriptions and oversized payloads from wasting context?
  • How do you audit who invoked what, and why?

This guide stays close to the official MCP documentation and focuses on operating concerns rather than toy examples.

Decide server boundaries first

One of the most common mistakes is building a single giant MCP server that exposes every internal API. That usually creates four problems at once.

  • Tool descriptions become long and repetitive
  • Security boundaries blur
  • Failures become harder to isolate
  • Versioning becomes coupled across unrelated domains

A better model is to split servers along domain or trust boundaries.

| Example boundary | What it owns | Why it should be isolated |
| --- | --- | --- |
| github-governance | PR lookup, ruleset status, required checks | Permissions and workflows are specific to source control governance |
| incident-ops | Alert summaries, runbook links, incident notes | Audit and reliability needs differ from developer tooling |
| docs-search | Read-only documentation resources | Search-heavy, read-only usage patterns benefit from simpler controls |

The best MCP platforms feel smaller than the systems behind them because they expose only the right surface.
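As a sketch, that boundary split can be expressed as a per-server manifest. The server and tool names below are illustrative, not from any real deployment:

```python
# Hypothetical per-server manifests: each MCP server owns one domain and
# one trust boundary, so its tool surface stays small and auditable.
SERVER_MANIFESTS = {
    "github-governance": {
        "trust_boundary": "source-control",
        "tools": ["list_pull_requests", "get_ruleset_status"],
        "mutating": False,
    },
    "incident-ops": {
        "trust_boundary": "incident-response",
        "tools": ["create_incident_note", "summarize_alerts"],
        "mutating": True,
    },
    "docs-search": {
        "trust_boundary": "read-only-docs",
        "tools": ["search_docs"],
        "mutating": False,
    },
}

def tools_visible_to(server_name: str) -> list[str]:
    """A client connected to one server sees only that server's tools."""
    return SERVER_MANIFESTS[server_name]["tools"]
```

Because no tool appears in two manifests, a compromised or misbehaving client of one server cannot reach another domain's surface.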

Tools and resources should not be interchangeable

The protocol distinguishes tools from resources for a reason.

  • tool
    • best for actions, transformations, or structured lookups that feel like function calls
  • resource
    • best for readable context such as documentation, inventories, or runbooks

If everything becomes a tool, the model over-calls interfaces just to read context. If everything becomes a resource, you lose clear action boundaries and make approval and auditing harder.

In practice:

  • use tools for deploy, create, update, validate, or targeted lookup flows
  • use resources for policies, docs, service maps, and runbooks

That separation improves both reliability and governance.
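A minimal sketch of that split, using a hand-rolled registry rather than any real MCP SDK, so the tool/resource distinction is explicit in code (all names are invented for the example):

```python
from dataclasses import dataclass, field

@dataclass
class Registry:
    """Toy stand-in for an MCP server: tools are callable actions,
    resources are readable context keyed by URI."""
    tools: dict = field(default_factory=dict)
    resources: dict = field(default_factory=dict)

    def tool(self, name: str):
        """Register an action the model invokes like a function call."""
        def wrap(fn):
            self.tools[name] = fn
            return fn
        return wrap

    def resource(self, uri: str):
        """Register readable context (docs, runbooks) fetched by URI."""
        def wrap(fn):
            self.resources[uri] = fn
            return fn
        return wrap

registry = Registry()

@registry.tool("create_incident_note")
def create_incident_note(incident_id: str, body: str) -> dict:
    # An action with a clear audit boundary: it mutates state.
    return {"incident_id": incident_id, "status": "created"}

@registry.resource("runbook://payments/rollback")
def rollback_runbook() -> str:
    # Pure context: reading it should never require an approval flow.
    return "1. Freeze deploys. 2. Roll back to last green build."
```

The payoff is that approval, logging, and rate-limit policy can key off one question: is this call in `tools` or in `resources`?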

Transport choice: stdio versus Streamable HTTP

Transport is not a minor implementation detail. It changes how you operate the server.

When stdio is a strong fit

  • local desktop clients
  • developer tooling
  • single-user flows
  • environments where the client can own the server process lifecycle

Strengths:

  • simple to implement
  • ideal for local integrations
  • fast to iterate on during development

Limits:

  • weak central observability
  • awkward for multi-tenant operation
  • lifecycle depends on the client process model

When Streamable HTTP is a better fit

  • shared services used by multiple clients
  • centrally managed authentication and policy
  • platform teams that need repeatable deployment, logging, and network controls

Strengths:

  • fits existing service operations patterns
  • works cleanly with gateways, ingress, and service mesh policies
  • easier to scale across client types

Limits:

  • requires more explicit auth and trust-boundary design
  • can become a thin wrapper over internal APIs if interface design is not intentional

Design tools for model clarity, not for internal convenience

Choose narrow, explicit names

Poor tool names make the model guess.

```
run_action
manage_item
operate_system
```

Better names make the outcome predictable.

```
list_pull_requests
get_ruleset_status
create_incident_note
```

Tool names are part of the prompt surface. Ambiguity increases selection errors.

Keep schemas strict and short

Every additional free-form field creates another chance for invalid calls. Practical rules:

  • minimize required inputs
  • prefer enums over unconstrained text where possible
  • keep descriptions short and action-oriented
  • avoid mixing human-facing labels with internal identifiers
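As an illustration, a strict schema for a hypothetical `get_ruleset_status` tool might constrain inputs with enums and a minimal validator. The field names and allowed values are invented for the example:

```python
# Tight input schema: two required fields, one enum-constrained,
# and no free-form text for the model to improvise in.
SCHEMA = {
    "required": ["repository", "check_type"],
    "enums": {"check_type": {"lint", "unit-test", "integration-test"}},
}

def validate(args: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; empty means valid."""
    problems = []
    for name in schema["required"]:
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, allowed in schema["enums"].items():
        if name in args and args[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    return problems
```

Returning readable problem strings, rather than a bare rejection, also gives the model something concrete to correct on the retry.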

Return the next useful shape, not a raw dump

Models do better with structured results that support the next decision. A result like this is more useful than a giant unfiltered JSON payload.

```json
{
  "repository": "platform/api",
  "ruleset_status": "blocking",
  "missing_checks": ["lint", "integration-test"],
  "next_actions": [
    "wait_for_required_checks",
    "request_codeowner_review"
  ]
}
```

In production systems, returning the smallest useful summary usually reduces repeated calls and wasted context.
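A sketch of that shaping step, assuming a typical source-control API payload. The raw field names (`repo`, `checks`, `state`) are assumptions about an upstream API, not part of MCP; the output mirrors the summary shape above:

```python
def summarize_ruleset(raw: dict) -> dict:
    """Collapse a large upstream payload into a next-decision summary."""
    missing = [c["name"] for c in raw.get("checks", [])
               if c["state"] != "passed"]
    return {
        "repository": raw["repo"]["full_name"],
        "ruleset_status": "blocking" if missing else "clear",
        "missing_checks": missing,
        # Suggest the follow-up so the model does not re-query to decide.
        "next_actions": ["wait_for_required_checks"] if missing else [],
    }
```

The tool boundary is exactly where this shaping belongs: the server sees the full payload once; the model only ever sees the summary.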

Token budgeting matters more than most teams expect

One of the most practical lessons from recent MCP tooling discussions is that context waste often matters more than raw model quality.

Common waste patterns:

  • duplicated tool descriptions
  • entire documents returned when only a summary is needed
  • large low-value fields included in every response
  • one server exposing too many tools

Operational rules that help:

  1. Keep tool descriptions short.
  2. Split large resources into retrievable chunks.
  3. Use summary-first responses and detail follow-up calls.
  4. Remove low-use tools based on real invocation data.

If your runbook server returns the full document body for every request, the system will feel slow and expensive even when the protocol is working correctly.
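One way to catch description bloat early is a rough token estimate over the tool catalog. The 4-characters-per-token rule is an approximation, not a tokenizer, and the budget is an arbitrary example value:

```python
def estimate_tokens(text: str) -> int:
    """Very rough heuristic: ~4 characters per token for English prose."""
    return max(1, len(text) // 4)

def flag_heavy_tools(descriptions: dict, budget_per_tool: int = 100) -> list:
    """Return tools whose descriptions alone exceed the per-tool budget.

    Every flagged description is paid for on every request that lists
    tools, whether or not the tool is ever called.
    """
    return sorted(
        name for name, desc in descriptions.items()
        if estimate_tokens(desc) > budget_per_tool
    )
```

Run this in CI over the tool catalog and the budget becomes a reviewable artifact instead of a surprise in production traces.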

Authentication and permission boundaries

The moment MCP touches internal systems, convenience stops being the main concern. Authorization does.

Questions you should answer before rollout:

  • Is the caller acting as a user delegate or as a service identity?
  • Are read-only and mutating tools isolated?
  • Should sensitive resources live on a separate server?
  • What minimum audit record must exist for every invocation?

Recommended operating pattern:

  • split read-only and mutating servers
  • require idempotency or approval for write-capable tools
  • log user, session, tool name, input summary, and result status
  • keep failures readable enough for human review
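The minimum audit record can be sketched as one JSON line per invocation. The field set follows the list above; the 200-character truncation is an arbitrary example choice:

```python
import json
import time

def audit_record(user: str, session: str, tool: str,
                 args: dict, status: str) -> str:
    """Emit one JSON line per invocation.

    Inputs are truncated to a summary so the audit log itself does not
    become a context-sized payload or a secret-retention problem.
    """
    record = {
        "ts": int(time.time()),
        "user": user,
        "session": session,
        "tool": tool,
        "input_summary": json.dumps(args, sort_keys=True)[:200],
        "status": status,
    }
    return json.dumps(record, sort_keys=True)
```

Keeping it to one flat line per call is what makes "failures readable enough for human review" achievable with ordinary log tooling.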

Observability and operational review

Treat MCP servers like production services, not glue code.

| Area | Metrics to watch | Why it matters |
| --- | --- | --- |
| Availability | Success rate, latency | Helps you see when clients stop trusting the server |
| Quality | Retry rate after tool calls | Surfaces confusing tools or weak schemas |
| Cost | Response size, estimated token load | Highlights context waste |
| Security | Denied calls, permission failures, unusual usage patterns | Detects abuse and policy gaps |
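These review signals can be computed directly from invocation records. The record shape here, with `tool`, `ok`, and `retry` fields, is an assumption about your logging format, not something MCP defines:

```python
from collections import Counter

def review_metrics(invocations: list) -> dict:
    """Compute review signals from invocation records.

    Each record is assumed to carry 'tool' (str), 'ok' (bool), and
    'retry' (bool, true if the client re-called after this result).
    """
    total = len(invocations)
    ok = sum(1 for i in invocations if i["ok"])
    retries = sum(1 for i in invocations if i["retry"])
    usage = Counter(i["tool"] for i in invocations)
    return {
        "success_rate": ok / total if total else 0.0,
        "retry_rate": retries / total if total else 0.0,
        # Candidates for removal: tools invoked only once in the window.
        "low_use_tools": [t for t, n in usage.items() if n == 1],
    }
```

The `low_use_tools` list feeds the earlier rule about pruning tools from real invocation data rather than intuition.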

Pre-production checklist

  • Are server boundaries aligned with domain and trust boundaries?
  • Are tool names and schemas narrow and explicit?
  • Do resources avoid oversized payloads?
  • Are audit logs sufficient?
  • Is fallback behavior defined for server failure?
  • Is the transport decision documented?

Anti-patterns

"We can just wrap every internal API with MCP"

That usually produces a protocol-shaped proxy, not a model-usable interface. MCP works best when the surface is intentionally designed for model reasoning.

"More tools means a smarter agent"

Usually the opposite happens. More tools create more selection ambiguity and more context overhead. Smaller, clearer interfaces outperform large noisy ones.

"If it works locally, it is ready for production"

A local stdio integration can feel great and still fail the moment central auth, multi-client access, deployment, and observability requirements arrive. Production MCP work is interface design plus platform operations.

Closing thoughts

Teams that deploy MCP well usually do not start with many tools. They start with clear boundaries, narrow schemas, and a deliberate transport model. The protocol is helpful, but lasting success comes from governance.

If you want a durable starting point, prioritize these two decisions first.

  • split servers by domain and trust boundary
  • design short, explicit tools and resources

Most of the later wins in security, observability, and cost control get easier once those two are right.
