The MCP Server as a Bounded Context

Most MCP servers in production today are the wrong size. They are too large. They started as one tool, grew to four, accreted to twelve, and at some point the project turned into a tools/ folder with thirty handlers covering payments, analytics, user admin, content, and a special section called misc/. The team senses something is wrong but the architectural language to articulate the wrongness is missing, so they live with it, and the server gets harder to reason about every quarter.

Domain-Driven Design has the language. Bounded contexts. A bounded context is a part of a system inside which a particular vocabulary holds — where "order" means one specific thing, "user" means one specific thing, "publish" maps to one specific behavior. Across context boundaries, those words stop meaning the same thing. The "user" in the auth context is not the same object as the "user" in the billing context. They share an ID; that is all.

Applied to MCP: one server per bounded context. Each server speaks one ubiquitous language. The tools inside it use the vocabulary of that context. Crossing into another context happens through a different server, not through more tools on the same server. The architectural pressure that comes from this single rule counters most of the "MCP servers grow into god-services" failure mode with little further effort.

This post is the DDD case made concrete, anchored on the Node.js Clean Architecture template we recommend as a backend foundation. That template ships with six bounded contexts — auth, users, books, orders, payments, admin — and it makes the per-context MCP-server story easy to reason about because the contexts are already drawn.

Why this is more important for MCP than for backends

Bounded contexts are a useful idea everywhere. They are especially useful for MCP for two reasons that do not apply as strongly to a regular backend.

The agent's reasoning is bounded by its context window. A model looking at a tool catalog of forty tools spread across six unrelated domains is reasoning over a haystack. It will sometimes pick the wrong needle. A model looking at six tools, all of which speak the same language about the same kind of object, is reasoning over a small, coherent surface. Tool selection accuracy tends to go up as a result. The size of the catalog is a quality metric, not just an engineering one.

Security is a tool-design problem. From the security post, the narrow-funnel principle says tools should be the smallest, most specific operations the team is comfortable letting the agent perform. Bounded contexts are the natural unit of "smallest specific" — a tool that operates on Book is reasoned about within the books context, not in some grand "all things in the system" framing. Cross-context capabilities are suspicious by default; the discipline of one server per context makes them visible the moment they appear.

These two together — better model reasoning, sharper security boundaries — are what make the rule pay off in MCP specifically. A 30-handler backend service is unwieldy but not dangerous. A 30-tool MCP server is both.

The shape, applied to a real template

The Clean Architecture template's six contexts:

auth — sign in, sign out, refresh, password reset, MFA enrollment.
users — profile, preferences, account settings.
books — catalog, search, metadata, reviews.
orders — cart, checkout, order history, fulfillment status.
payments — methods, charges, refunds, invoices.
admin — content moderation, user lifecycle, audit log queries.

A team approaching MCP in this template's shape might be tempted to ship one server: bookstore-mcp, with thirty tools, one per use case across all contexts. That is the wrong default.

The right default: six servers. One per context. Each one named after the context: bookstore-auth-mcp, bookstore-orders-mcp, etc. Each one exposes the tools that make sense within that context's ubiquitous language.

A representative subset:

`bookstore-orders-mcp`:

searchOrders({ status, dateRange, customerId }) — read.
getOrderDetail({ orderId }) — read.
cancelOrder({ orderId, reason }) — mutation.
requestRefund({ orderId, items }) — mutation.

`bookstore-payments-mcp`:

listPaymentMethods({ customerId }) — read.
chargeCustomer({ customerId, amount, currency }) — mutation, scoped tightly.
refundCharge({ chargeId, amount }) — mutation.
getInvoice({ invoiceId }) — read.

`bookstore-admin-mcp`:

listAuditLog({ userId, dateRange }) — read.
lockUserAccount({ userId, reason, duration }) — mutation.
releaseUserAccount({ userId }) — mutation.

Notice what is absent from bookstore-orders-mcp: anything about charging the customer, anything about looking up the customer's payment methods. Those are payments-context concerns. If an order workflow needs them, the host wires both servers together: the agent has both in its catalog, the orders tool returns enough information for the payments tool to act on, and the cross-context behavior lives at the agent level rather than the server level.

This is the single most important pattern in this post. Cross-context coordination lives in the agent, not in a tool. The moment you catch yourself writing placeOrderAndCharge(...) as a single tool, you have collapsed two contexts into one, and you have done it precisely at the boundary where keeping them separate gives you the most leverage.
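A minimal sketch of that division of labor, with two in-memory objects standing in for the two servers. The `ToolServer` type, the `callTool` dispatcher, and the `chargeId` and `refundableAmount` fields returned by `cancelOrder` are illustrative assumptions, not part of the template or of any MCP SDK; a real host would reach each server through an MCP client.

```typescript
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<ToolArgs>;

interface ToolServer {
  name: string;
  tools: Map<string, ToolHandler>;
}

// Orders context: cancels the order and reports what a payments tool would
// need, but never touches money itself.
const ordersServer: ToolServer = {
  name: "bookstore-orders-mcp",
  tools: new Map<string, ToolHandler>([
    [
      "cancelOrder",
      async (args) => ({
        orderId: args.orderId,
        status: "cancelled",
        chargeId: "charge-demo-1", // hypothetical handoff field
        refundableAmount: 2599, // hypothetical, in minor currency units
      }),
    ],
  ]),
};

// Payments context: knows charges and refunds, nothing about orders.
const paymentsServer: ToolServer = {
  name: "bookstore-payments-mcp",
  tools: new Map<string, ToolHandler>([
    [
      "refundCharge",
      async (args) => ({
        chargeId: args.chargeId,
        refundedAmount: args.amount,
        status: "refunded",
      }),
    ],
  ]),
};

const connectedServers: ToolServer[] = [ordersServer, paymentsServer];

// Host-side dispatch: route a call to the server that owns the tool.
async function callTool(
  serverName: string,
  toolName: string,
  args: ToolArgs
): Promise<ToolArgs> {
  const server = connectedServers.find((s) => s.name === serverName);
  const handler = server?.tools.get(toolName);
  if (!handler) throw new Error(`Unknown tool ${serverName}/${toolName}`);
  return handler(args);
}

// Stands in for the two tool calls an agent would choose across turns.
async function cancelThenRefund(orderId: string): Promise<ToolArgs> {
  const cancelled = await callTool("bookstore-orders-mcp", "cancelOrder", {
    orderId,
    reason: "customer request",
  });
  return callTool("bookstore-payments-mcp", "refundCharge", {
    chargeId: cancelled.chargeId,
    amount: cancelled.refundableAmount,
  });
}
```

Neither server imports the other. The only thing that crosses the boundary is the data the agent carries from one tool result into the next tool call, which is also why `refundCharge` cannot be reached through the orders server at all.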

Ubiquitous language at the tool-name level

A second DDD instinct that pays off enormously in MCP: each bounded context has a ubiquitous language. The words used in conversation, in code, in tests, in user-facing copy, are all the same words and they all mean the same thing.

For an MCP server, the ubiquitous language ends up encoded in three places:

Tool names. cancelOrder, not removeOrderRecord. lockUserAccount, not disableUser.
Parameter names. orderId, not id. customerId, not userId (in the orders context, "customer" is the right word; the auth context calls them "user").
Tool descriptions. Use the same vocabulary in the description that the team uses in design docs and bug reports.

The discipline pays off when the agent gets the call right on the first try because the model is not playing translator. The model reads "cancelOrder", sees orderId as a parameter, has been told to act on an order — every word in the user's message that says "cancel my order" maps directly. No translation, no ambiguity, no fishing.
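As a sketch of what that looks like on the wire, here is one tool definition written in the orders vocabulary, plus a deliberately crude check for words that belong elsewhere. The `ToolDefinition` interface loosely follows the name, description, and input-schema shape of an MCP tool listing; the description text, the `required` list, the `foreignTerms` list, and the `foreignTermsIn` helper are all assumptions made for illustration.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

const cancelOrderTool: ToolDefinition = {
  name: "cancelOrder",
  description:
    "Cancel an order on behalf of the customer who placed it. " +
    "Takes the orderId and the reason for the cancellation.",
  inputSchema: {
    type: "object",
    properties: {
      orderId: { type: "string", description: "The order to cancel." },
      reason: { type: "string", description: "Why the order is being cancelled." },
    },
    required: ["orderId", "reason"],
  },
};

// Words from other contexts or from the storage layer (illustrative list).
const foreignTerms = ["user", "record", "row"];

// Returns the foreign terms found in the tool's name, parameter names, or
// descriptions. Plain substring matching, so expect false positives.
function foreignTermsIn(tool: ToolDefinition, terms: string[]): string[] {
  const properties = tool.inputSchema.properties;
  const words = [tool.name, tool.description];
  for (const paramName of Object.keys(properties)) {
    words.push(paramName, properties[paramName]?.description ?? "");
  }
  const text = words.join(" ").toLowerCase();
  return terms.filter((term) => text.indexOf(term.toLowerCase()) !== -1);
}
```

A check like this is cheap enough to run in a unit test per server, with a different term list for each context.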

Where this gets uncomfortable: when the backend uses different words than the ubiquitous language of the context. The backend's database schema has tbl_orders.id and the ORM exposes it as Order.id, but the ubiquitous language says orderId. Resist the pull to expose id at the tool surface. The tool surface is part of the public-facing language of the system; it should match the language the team and the user use, not the language the database happens to use. The mapping is a small piece of code in the handler, and it is one of the cheapest investments in long-term clarity you can make.
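The mapping really is small. A sketch, assuming a hypothetical `OrderRow` shape for what the ORM returns and an in-memory map in place of the real repository; the column names other than `id` are invented for the example.

```typescript
// Hypothetical persistence shape, as an ORM might expose a tbl_orders row.
interface OrderRow {
  id: string;
  user_id: string;
  state: "open" | "shipped" | "cancelled";
}

// What the tool returns, in the orders context's own words.
interface OrderDetail {
  orderId: string;
  customerId: string;
  status: "open" | "shipped" | "cancelled";
}

// The whole translation: storage names in, context names out.
function toOrderDetail(row: OrderRow): OrderDetail {
  return { orderId: row.id, customerId: row.user_id, status: row.state };
}

// In-memory stand-in for the repository the real handler would query.
const orderRows = new Map<string, OrderRow>([
  ["order-42", { id: "order-42", user_id: "user-7", state: "shipped" }],
]);

function getOrderDetail(args: { orderId: string }): OrderDetail {
  const row = orderRows.get(args.orderId);
  if (!row) throw new Error(`No order found for orderId ${args.orderId}`);
  return toOrderDetail(row);
}
```

Because `toOrderDetail` builds a new object rather than spreading the row, a column added to the table later does not appear at the tool surface by accident.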

Anti-corruption layers

The anti-corruption layer is the pattern that ties this together: it applies whenever an MCP server has to call into a system that does not speak the bounded context's language.

A worked example: bookstore-payments-mcp wraps Stripe. Stripe has its own vocabulary — PaymentIntent, Charge, Refund, Customer, PaymentMethod. Some of these match the payments context's vocabulary; some do not. Stripe's Customer is not the same object as the bookstore's Customer; Stripe's Charge carries a lot of metadata the bookstore does not care about; Stripe's idea of a refund is more granular than the payments context wants to expose.

The wrong move: pass Stripe's vocabulary through to the agent. chargePaymentIntent({ paymentIntentId }) is leaking Stripe's terms into the agent's context.

The right move: the anti-corruption layer. Inside the MCP server, between the tool handler and the Stripe SDK, sits a translator. The tool handler speaks the bounded context's language: chargeCustomer({ customerId, amount, currency }). The translator maps that to the right sequence of Stripe operations — find or create a PaymentIntent, confirm it, return a normalized result.

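```typescript
// `stripeClient` and `mapping` are application-level helpers defined elsewhere
// in the server: the first is assumed to return a configured Stripe SDK client,
// the second to translate between payments-context identifiers and Stripe's.
async function chargeCustomerHandler(args: {
  customerId: string;
  amount: number; // in the currency's smallest unit (for example cents)
  currency: string;
}) {
  const stripe = stripeClient();

  // Translate the bounded context's customer into Stripe's vocabulary.
  const stripeCustomerId = await mapping.toStripeCustomer(args.customerId);
  const paymentMethodId = await mapping.defaultPaymentMethod(args.customerId);

  // Create and confirm the PaymentIntent in a single call.
  const intent = await stripe.paymentIntents.create({
    amount: args.amount,
    currency: args.currency,
    customer: stripeCustomerId,
    payment_method: paymentMethodId,
    confirm: true,
  });

  // `latest_charge` is a charge ID, an expanded Charge object, or null.
  const latestCharge = intent.latest_charge;
  const chargeId =
    typeof latestCharge === "string" ? latestCharge : latestCharge?.id ?? null;

  // Return a normalized result; no Stripe-specific names reach the agent.
  return {
    chargeId,
    status: mapping.toBoundedContextStatus(intent.status),
    amount: intent.amount,
  };
}
```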

The agent sees the bounded context's vocabulary. The Stripe SDK lives entirely inside the handler. If Stripe deprecates PaymentIntent and replaces it with something else, the change is contained to the anti-corruption layer; the tool surface is unchanged.

This pattern matters more for MCP than for regular APIs because the agent is reasoning over the tool surface in real time. A change in vocabulary at the tool level changes the agent's behavior. Insulating the tool surface from upstream vocabulary changes is what lets you keep the agent's behavior stable while the underlying systems churn.
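The handler above leans on `mapping.toBoundedContextStatus` without showing it. One way it might look is sketched below. The `StripeIntentStatus` union lists the PaymentIntent statuses Stripe documents at the time of writing and is declared locally so the example stands alone; the `ChargeStatus` vocabulary and the choice of which Stripe status maps to which context status are assumptions, not something the template prescribes.

```typescript
// PaymentIntent statuses, declared locally instead of importing Stripe types.
type StripeIntentStatus =
  | "requires_payment_method"
  | "requires_confirmation"
  | "requires_action"
  | "processing"
  | "requires_capture"
  | "canceled"
  | "succeeded";

// Hypothetical vocabulary of the payments bounded context.
type ChargeStatus = "succeeded" | "pending" | "needs_customer_action" | "failed";

function toBoundedContextStatus(status: StripeIntentStatus): ChargeStatus {
  switch (status) {
    case "succeeded":
      return "succeeded";
    case "processing":
    case "requires_capture":
      return "pending";
    case "requires_action":
    case "requires_confirmation":
      return "needs_customer_action";
    case "requires_payment_method":
    case "canceled":
      return "failed";
    default: {
      // Compile-time guard: adding a status to the union without mapping it
      // makes this assignment fail to type-check.
      const unhandled: never = status;
      throw new Error(`Unhandled Stripe status: ${String(unhandled)}`);
    }
  }
}
```

Seven upstream states collapse into four. The agent only ever reasons about the four, so an upstream change to the seven is absorbed here rather than in the tool description.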

When one server is right after all

The "one server per context" rule is the right default. It is not the right rule for every project, and the cases where breaking it is correct are worth naming.

Solo or low-volume contexts. A context that has two tools and is unlikely to grow much may not justify its own server. Combining it with a closely related context, with a clear naming separation, is fine. The cost of a separate server (deployment, auth wiring, observability) is real; do not pay it for a context that does not earn it.

Strongly coupled contexts that share state. Some contexts are so intertwined that splitting them creates more pain than it solves. Auth and users sometimes look like this: the auth context controls login, the users context controls profile, but they share the same underlying user record. A single identity-mcp covering both is sometimes cleaner than auth-mcp and users-mcp with constant cross-calls. Use judgment; the rule is a default, not a law.

Throwaway internal tools. A one-off MCP server for a small team's internal use, with five tools across three nominal contexts, is fine as one server. The discipline matters when the server is meant to last; throwaway code should obey throwaway rules.

The general principle: split when the cost of not splitting starts to exceed the cost of splitting. For most production MCP servers, that point arrives sooner than teams expect.

What this means for hosts

A host that connects to six MCP servers behaves differently from one connected to a single mega-server. Two effects are worth being aware of:

The agent sees a catalog of the same size, but a more coherent one. Six servers exposing five tools each is thirty tools total, the same number as one server with thirty tools. But the model receives them grouped, named in distinct vocabularies, and tagged with their server of origin. Tool selection should improve because the structure of the catalog mirrors the structure of the domain.

Auth becomes per-server. Each server has its own protocol-version negotiation, its own auth, its own session. For stdio servers, this is barely a concern; for HTTP servers, the host either juggles multiple OAuth flows or routes them all through one IdP, where each server is a separate resource server. The OAuth post covers the IdP-as-authorization-server pattern that makes this manageable; the transport post covers when each transport is right.

A host with thoughtful UX will surface the bounded contexts to the user — "you have access to Orders, Payments, and Admin tools; click to expand." The user understands the structure of what the agent can do. The agent reasons over a more navigable space. Both improvements come from the same architectural choice.
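A sketch of the host-side bookkeeping behind both effects: a merged catalog that keeps the server of origin in every tool name, and a one-line summary for the user. The `ConnectedServer` shape, the `/` separator, and both function names are assumptions; real hosts choose their own namespacing scheme.

```typescript
interface ToolInfo {
  name: string;
  description: string;
}

interface ConnectedServer {
  serverName: string;
  contextLabel: string; // what the user sees, e.g. "Orders"
  tools: ToolInfo[];
}

interface CatalogEntry {
  qualifiedName: string;
  context: string;
  description: string;
}

// Flatten every server's tools into one catalog, keeping the server of
// origin visible in each entry.
function buildCatalog(servers: ConnectedServer[]): CatalogEntry[] {
  const catalog: CatalogEntry[] = [];
  for (const server of servers) {
    for (const tool of server.tools) {
      catalog.push({
        qualifiedName: `${server.serverName}/${tool.name}`,
        context: server.contextLabel,
        description: tool.description,
      });
    }
  }
  return catalog;
}

// One entry per bounded context for the user-facing summary.
function summarizeContexts(servers: ConnectedServer[]): string {
  const labels = servers.map((s) => `${s.contextLabel} (${s.tools.length} tools)`);
  return `You have access to: ${labels.join(", ")}`;
}
```

The same `contextLabel` feeds both the model-facing catalog and the user-facing summary, so the two views of the agent's capabilities cannot drift apart.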

How to draw the boundaries

For a team approaching this fresh, the question becomes: how do I find my bounded contexts?

The same way DDD has answered this for two decades. A few practical heuristics that translate well to MCP:

Listen to how the team talks. What words come up in conversation about the system? Where does the same word mean different things to different team members? Each "different meaning" is a context boundary.

Look at the existing data model. Tables that share many foreign keys belong in the same context. Tables that hand off through a single ID and otherwise do not relate are likely separate contexts.

Map the authorization model. If the same set of permissions covers a group of operations, that group is probably a context. If permissions diverge — admins can do one set of things, users can do another — the operations split along permission lines.

Watch where coordination is happening. If two parts of the system have to be kept in sync through events, sagas, or compensating transactions, those parts are separate contexts that need to communicate, not one context with a complicated transaction. The complexity of the coordination is the signature of the boundary.
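Of these heuristics, the authorization one is the easiest to approximate mechanically. The sketch below groups operations by the exact set of permissions they require; each group is a candidate context, not a verdict. The `Operation` shape and the coarse permission names are assumptions, and a system with fine-grained read and write scopes would need to normalize them to a coarser prefix first.

```typescript
interface Operation {
  name: string;
  requiredPermissions: string[];
}

// Group operation names by their permission set. The key is the sorted,
// joined permission list, so order in the input does not matter.
function groupByPermissionSet(operations: Operation[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const operation of operations) {
    const key = operation.requiredPermissions.slice().sort().join("+");
    const names = groups.get(key) ?? [];
    names.push(operation.name);
    groups.set(key, names);
  }
  return groups;
}
```

An operation that lands in a group of its own with a compound key, such as `admin+users`, is worth a second look: it may be a cross-context capability that belongs in the agent rather than in a tool.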

For the bookstore template, the boundaries are visible from the folder structure — the team that built it did this work upfront. For a less-organized backend, the work is real but worth doing once. The MCP-server-per-context layout is a forcing function; the boundaries it requires are the ones the rest of the system was already going to need.

The smell test

A short list of signs your server has stopped being a single bounded context:

A `misc` or `utils` folder of tools. Almost always a sign that tools belonging to different contexts have been collected because nobody knew where else to put them.
Tool names that share prefixes from different domains. userPreferences_get, orderHistory_get, paymentMethod_list. The prefixes are doing the work of context boundaries; the prefixes should be servers.
Schemas that reference IDs from multiple domains in a single tool. cancelOrderAndIssueRefundAndNotifyUser({ orderId, userId, paymentMethodId }). Cross-context coordination has leaked into a tool.
Tool descriptions that explain "this is a thin wrapper around our internal X service." The wrapper is the anti-corruption layer; if you have to explain it in the description, the layer is leaking through.
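The four signs above are mechanical enough to lint for. A rough sketch follows, assuming a hypothetical `ToolSpec` record per tool and a hand-maintained map from ID parameters to the context that owns them. `customerId` is left out of that map on purpose, since it is the shared ID that legitimately crosses contexts.

```typescript
interface ToolSpec {
  name: string;
  folder: string;
  description: string;
  params: string[];
}

// Which context owns each ID parameter (illustrative, hand-maintained).
const idOwner = new Map<string, string>([
  ["orderId", "orders"],
  ["userId", "users"],
  ["paymentMethodId", "payments"],
  ["chargeId", "payments"],
  ["invoiceId", "payments"],
]);

function findSmells(tools: ToolSpec[]): string[] {
  const smells: string[] = [];
  const prefixes = new Set<string>();

  for (const tool of tools) {
    // Sign 1: tools parked in a misc or utils folder.
    if (/^(misc|utils)$/i.test(tool.folder)) {
      smells.push(`${tool.name}: lives in ${tool.folder}/`);
    }
    // Sign 2 (collected here, reported below): domain prefixes in names.
    const underscore = tool.name.indexOf("_");
    if (underscore > 0) prefixes.add(tool.name.slice(0, underscore));
    // Sign 3: IDs owned by more than one context in a single schema.
    const owners = new Set<string>();
    for (const param of tool.params) {
      const owner = idOwner.get(param);
      if (owner) owners.add(owner);
    }
    if (owners.size > 1) {
      smells.push(`${tool.name}: mixes IDs from ${Array.from(owners).sort().join(", ")}`);
    }
    // Sign 4: the description explains the wrapper instead of the behavior.
    if (/wrapper around/i.test(tool.description)) {
      smells.push(`${tool.name}: description exposes the wrapped service`);
    }
  }

  if (prefixes.size > 1) {
    smells.push(`catalog: name prefixes ${Array.from(prefixes).sort().join(", ")} look like separate servers`);
  }
  return smells;
}
```

Run against a server's tool list in CI, a non-empty result is a prompt for a conversation, not a build failure.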

These are not catastrophes. Each one is a signal the architecture has drifted. Rename, split, refactor — the same techniques you would use on a backend service.

Where this fits

This post sits next to the security post and the pillar as a third architectural reference. The versioning post becomes more tractable when the unit of versioning is a per-context server with five tools, not a god-server with thirty. The testing post follows naturally — each server has its own test suite, scoped to its context's vocabulary and operations.

For multi-tenant deployments, the multi-tenant checklist interacts with bounded contexts: a single tenant might have access to a subset of contexts (some servers, not others), and that subset is the right unit for tenant scoping.
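In code, that scoping can be as small as a list of contexts per tenant. A sketch, assuming a hypothetical `Tenant` record and the `bookstore-<context>-mcp` naming pattern used earlier; how the allowed list is enforced (host configuration, gateway, token audience) is left open.

```typescript
type BookstoreContext = "auth" | "users" | "books" | "orders" | "payments" | "admin";

interface Tenant {
  tenantId: string;
  contexts: BookstoreContext[];
}

// Server names follow the one-server-per-context naming pattern.
function serverNameFor(context: BookstoreContext): string {
  return `bookstore-${context}-mcp`;
}

// The servers a tenant's host is allowed to connect to.
function serversForTenant(tenant: Tenant): string[] {
  return tenant.contexts.map(serverNameFor);
}

function isServerAllowed(tenant: Tenant, serverName: string): boolean {
  return serversForTenant(tenant).indexOf(serverName) !== -1;
}
```

With a single thirty-tool server, the same policy would have to be expressed tool by tool.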

One server per bounded context is the kind of rule that costs an hour of architectural conversation upfront and saves a year of "the server has gotten weird" conversations later. The instincts that keep a backend monolith healthy keep an MCP tool surface healthy. The same vocabulary, the same boundaries, the same anti-corruption layers — applied at the tool level, where the agent is reasoning over the surface in real time.