Tim Quinteiro

MCP observability vs APM: what Datadog, Sentry, and New Relic miss

Your APM sees HTTP and infrastructure. It does not see the MCP protocol. Here is exactly what falls through the gap, and why you need both.

If you already run Datadog, Sentry, or New Relic, you might reasonably ask why your MCP server needs anything else. You instrument everything else with your APM. Why not this?

The short answer: your APM sees HTTP and infrastructure. It does not see the MCP protocol. This post is about exactly what falls through that gap, and why the answer is “both,” not “either.”

What your APM sees

A general-purpose APM is excellent at what it was built for. It sees:

  • HTTP requests and responses, status codes, and route-level latency.
  • Infrastructure: CPU, memory, container health, database queries.
  • Stack traces when your process throws.
  • Distributed traces across your services, stitched by trace context.

For an MCP server, that means your APM can tell you the POST /mcp endpoint returned 200 in 180ms. It cannot tell you what happened inside.

What it misses

An MCP exchange is a JSON-RPC request and its response: a tool call, a prompt fetch, a resource read. One HTTP request can carry a tool call that failed at the protocol level while the HTTP layer reports a clean 200. That is the core of the problem. Here is what your APM cannot answer:

Which tool was called, with what arguments. To the APM it is an opaque request body. To you, the tool name and arguments are the whole story.

Whether the tool call actually succeeded. MCP errors live inside the JSON-RPC response body. A failed call comes back either as a JSON-RPC error object or as a tool result flagged isError, and both can arrive inside a 200 OK. Your APM counts that as a success. Your customer’s agent counts it as a failure.
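To make the gap concrete, here is a minimal sketch of classifying an MCP response by its body rather than its HTTP status. It assumes the response arrives as a single plain JSON document (not an SSE stream), and the function name and outcome labels are hypothetical, chosen only for illustration.

```python
import json


def classify_mcp_response(http_status: int, body: str) -> str:
    """Classify one MCP response as seen by protocol-level monitoring.

    Assumes `body` is a single JSON-RPC response object encoded as JSON.
    The outcome labels are illustrative, not part of any standard.
    """
    if http_status >= 400:
        return "http_error"  # the only failure a route-level APM view counts
    message = json.loads(body)
    if "error" in message:
        return "protocol_error"  # JSON-RPC error object inside a 200
    result = message.get("result")
    if isinstance(result, dict) and result.get("isError") is True:
        return "tool_error"  # tool ran and reported failure in its result
    return "ok"


# A failed tool call that the HTTP layer reports as a clean 200.
failed_body = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "could not parse date range"}],
        "isError": True,
    },
})
outcome = classify_mcp_response(200, failed_body)
```

Here `outcome` is "tool_error", even though the status code alone would have counted the request as a success.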

Which client made the call. Claude Desktop, Cursor, Codex, Windsurf, or some agent you have never seen. Client identity is the dimension that explains most regressions, and it is buried where your APM does not break it out: in the clientInfo field of the MCP initialize handshake, or at best in a User-Agent header.

Session continuity. MCP work happens across a session. APM traces are per-request. Reconstructing “what was this agent trying to do” from per-request HTTP spans is painful.
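As a rough sketch of what session-level reconstruction involves, the snippet below folds a stream of JSON-RPC requests into one timeline per session, picking up the client name from the initialize handshake along the way. It assumes each request has already been paired with a session identifier (for example the Mcp-Session-Id the server issued); the function name, the record shape, and the client name "example-client" are hypothetical.

```python
def build_session_timelines(messages):
    """Group MCP requests into per-session timelines.

    `messages` is an iterable of (session_id, jsonrpc_request) pairs in
    arrival order. How session_id is obtained is an assumption of this sketch.
    """
    sessions = {}
    for session_id, request in messages:
        timeline = sessions.setdefault(
            session_id, {"client": None, "tool_calls": []}
        )
        method = request.get("method")
        params = request.get("params") or {}
        if method == "initialize":
            # clientInfo identifies the calling client for the whole session.
            timeline["client"] = (params.get("clientInfo") or {}).get("name")
        elif method == "tools/call":
            timeline["tool_calls"].append(params.get("name"))
    return sessions


messages = [
    ("s1", {"method": "initialize",
            "params": {"clientInfo": {"name": "example-client", "version": "1.0"}}}),
    ("s1", {"method": "tools/call",
            "params": {"name": "list_customers", "arguments": {}}}),
    ("s2", {"method": "tools/call",
            "params": {"name": "get_invoice", "arguments": {"id": "42"}}}),
    ("s1", {"method": "tools/call",
            "params": {"name": "search_orders", "arguments": {"query": "refund"}}}),
]
timelines = build_session_timelines(messages)
```

With this shape, "what was this agent trying to do" becomes a lookup on `timelines["s1"]` instead of a join across unrelated HTTP spans.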

Per-tool performance. Your APM gives you route-level latency. But POST /mcp is one route carrying twenty different tools with wildly different performance profiles. The average is a lie.
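A small illustration of why the route-level number misleads, using made-up latencies and hypothetical tool names: one slow tool disappears into the average for the route, and only reappears when you group by tool.

```python
from collections import defaultdict
from statistics import mean


def per_tool_latency(calls):
    """Return mean latency in ms per tool from (tool_name, latency_ms) pairs."""
    by_tool = defaultdict(list)
    for tool, latency_ms in calls:
        by_tool[tool].append(latency_ms)
    return {tool: mean(values) for tool, values in by_tool.items()}


# Illustrative numbers only: three fast calls and one slow one on POST /mcp.
calls = [
    ("get_user", 20),
    ("get_user", 30),
    ("search_orders", 900),
    ("get_user", 10),
]

route_average = mean(latency for _, latency in calls)  # what the route view shows
tool_averages = per_tool_latency(calls)                # what you actually need
```

The route averages 240 ms, a figure that describes neither `get_user` (20 ms) nor `search_orders` (900 ms).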

A worked example

A customer reports that “the agent keeps failing.” You open your APM. The /mcp endpoint shows a 99.8% success rate and a healthy p95. Nothing looks wrong. You close the ticket as “cannot reproduce.”

What actually happened: one tool, search_orders, returns a JSON-RPC error for any query containing a date range, because of a parsing bug. That is a 200 OK at the HTTP layer every single time. The error is in the response body. To your APM it is invisible. To the agent calling it, every date-range search fails.

With protocol-level monitoring, this is a thirty-second find: filter to search_orders, sort by error, see that every failure carries a date-range argument. Same data, completely different debugging experience, because the unit of observation is the tool call, not the HTTP request.
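The "thirty-second find" amounts to a filter and a split. Here is a minimal sketch over recorded tool calls, assuming each record carries the tool name, its arguments, and an error flag. The record shape and the `date_from` argument name are hypothetical stand-ins for whatever your tool actually accepts.

```python
def error_rate(calls):
    """Fraction of calls flagged as errors (0.0 for an empty list)."""
    if not calls:
        return 0.0
    return sum(1 for call in calls if call["is_error"]) / len(calls)


def split_error_rate_by_argument(calls, tool, argument):
    """Compare error rates for one tool, with and without a given argument."""
    tool_calls = [call for call in calls if call["tool"] == tool]
    with_arg = [c for c in tool_calls if argument in c["arguments"]]
    without_arg = [c for c in tool_calls if argument not in c["arguments"]]
    return error_rate(with_arg), error_rate(without_arg)


# Illustrative records only.
recorded_calls = [
    {"tool": "search_orders", "arguments": {"query": "refund"}, "is_error": False},
    {"tool": "search_orders",
     "arguments": {"query": "refund", "date_from": "2024-01-01"}, "is_error": True},
    {"tool": "search_orders",
     "arguments": {"query": "late", "date_from": "2024-02-01"}, "is_error": True},
    {"tool": "get_user", "arguments": {"id": "7"}, "is_error": False},
]

with_range, without_range = split_error_rate_by_argument(
    recorded_calls, "search_orders", "date_from"
)
```

In this sample every `search_orders` call with a date range fails and every call without one succeeds, which points straight at the parsing bug.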

Why not just add custom spans?

You can. You can manually instrument your MCP handlers with custom spans and attributes in your APM. Teams do it. Two things tend to happen.

First, it is a lot of bespoke work, and it drifts. Every new tool needs new instrumentation, and the moment someone forgets, you have a blind spot exactly where you will eventually need to look.

Second, the MCP model does not map cleanly onto a span tree. An MCP request/response is one JSON-RPC exchange: one node, not a tree of spans. When you force it into a span hierarchy, the mismatch leaks into every query you write. We learned this building Spanly, and it is why the product models MCP natively instead of dressing it up as something it is not.

The answer is both

This is not a replacement pitch. Keep your APM. It owns HTTP, infrastructure, and your wider service, and it does that well. Add a layer that understands the MCP protocol on top.

Spanly is additive by design. Continue sending HTTP and infrastructure telemetry to Datadog, Sentry, or New Relic, and send MCP-shaped traffic to Spanly. Because the SDK propagates the W3C trace context on inbound MCP requests, the same exchange links across both systems: you can jump from a Spanly tool call straight to the matching trace in your APM. The SDK can also export traces back to your APM over OpenTelemetry, so the MCP span shows up in both places.
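The linking relies on the W3C Trace Context `traceparent` header. The sketch below shows how a trace ID can be read from that header so that two systems can refer to the same exchange. It is not Spanly's SDK code; the function name is hypothetical, and it handles only the four-field form of the header.

```python
import re

# version - trace-id - parent-id - trace-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header: str):
    """Parse a four-field traceparent header, returning None if invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are forbidden by the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


context = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
```

Storing `context["trace_id"]` next to each tool call record is what makes the jump between the two systems possible, since the APM indexes its traces by the same ID.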

Most of our customers run both side by side. The APM answers “is the service healthy.” Spanly answers “is the protocol healthy,” and those are genuinely different questions.
