New: Introducing Spanly

Observability for MCP servers

The dedicated layer for MCP traffic, alongside the Datadog, Sentry, or New Relic you already run. Built for engineering teams shipping MCP servers in production.

Works with your existing APM via OpenTelemetry · Open-source SDK & CLI

< 1ms

SDK overhead per request.

99.9%

Uptime SLA for data ingestion.

2 regions

US and EU, with full data residency.

OTel

Export to any OpenTelemetry APM via the CLI.

Open source

SDK and CLI source on GitHub, Apache 2.0.

Drop in the SDK. Send MCP traces to Spanly, your existing APM, or both.

WHY A DEDICATED LAYER

Your APM sees HTTP. Spanly sees MCP.

Datadog, Sentry, and New Relic do an excellent job on requests, traces, and infrastructure. They don't model the MCP protocol. Spanly does, so an on-call engineer can read a failing tool call without piecing it together from raw spans.

              Your APM                           Spanly
Protocol      HTTP POST /mcp                     tools/call · resources/read · prompts/get
Errors        500 with stack trace               Tool-call rejection with the full prompt and arguments
Performance   p95 endpoint latency               p95 per tool, per server name, per client
Payload       Bytes in / bytes out               Tokens in / tokens out, resource read sizes
Clients       One client identity per session    Claude · Cursor · Windsurf · your custom client

We complement your existing stack. We don't replace it. Spanly's SDK is additive. Keep sending HTTP and infra telemetry to your APM, and send MCP-shaped traffic here.
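
To make the protocol row concrete: an MCP request is a JSON-RPC 2.0 message, so the operation an APM records as `POST /mcp` can be recovered from the `method` and `params` fields of the body. The sketch below is illustrative only. `describeMcpCall` is a hypothetical helper, not part of the Spanly SDK, and it assumes the request body has already been parsed.

```typescript
// Minimal shape of a JSON-RPC 2.0 request as carried by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id?: string | number;
  method: string;
  params?: Record<string, unknown>;
}

// Build a readable operation label such as "tools/call: web_search".
// Hypothetical helper for illustration; not part of the Spanly SDK.
function describeMcpCall(req: JsonRpcRequest): string {
  const params = req.params ?? {};
  const targetName = params["name"];
  const targetUri = params["uri"];
  switch (req.method) {
    case "tools/call":
    case "prompts/get":
      // Both methods identify their target with a `name` parameter.
      return typeof targetName === "string" ? `${req.method}: ${targetName}` : req.method;
    case "resources/read":
      // Resource reads identify their target with a `uri` parameter.
      return typeof targetUri === "string" ? `${req.method}: ${targetUri}` : req.method;
    default:
      return req.method;
  }
}

const exampleLabel = describeMcpCall({
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "web_search", arguments: { query: "mcp observability" } },
});
console.log(exampleLabel); // "tools/call: web_search"
```

An HTTP-level trace would show the same request only as a POST with a status code and a byte count.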

Everything you need to monitor MCP

Spanly captures every MCP request your server handles, from tools/call to resource reads, and turns it into real-time traces, error tracking, performance metrics, and analytics.

[Trace preview: timing breakdowns for initialize, tools/call, and resources/read]

Real-time Traces

Visualize every request and response between clients and servers with detailed timing breakdowns. Drill into a single MCP call to see the full payload, the server that handled it, and where the milliseconds went.

Error Rate: 0.42% (−12%)
Affected Servers: 3 (+1)

Error Code          Count   Last Occurred
Invalid Params      24      2m ago
Internal Error      11      14m ago
Request Timeout     7       1h ago
Method Not Found    5       3h ago
Connection Closed   3       8h ago

Error Tracking

Catch errors before your users do. Track error codes, frequencies, and stack traces across every server and client – grouped so a recurring failure shows up once, not a thousand times.
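
As a rough illustration of what grouping means here, the sketch below collapses individual error events into one row per error code and method, keeping a count and the most recent occurrence. The event shape, the field names, and the `mcp-prod-west` server are assumptions made for the example. Spanly's actual grouping key is not documented on this page.

```typescript
// One observed MCP error. Field names are illustrative assumptions.
interface McpErrorEvent {
  server: string;
  method: string;      // e.g. "tools/call"
  code: number;        // JSON-RPC error code, e.g. -32602 for "Invalid params"
  timestampMs: number; // Unix epoch, milliseconds
}

interface ErrorGroup {
  code: number;
  method: string;
  count: number;
  lastOccurredMs: number;
  servers: Set<string>;
}

// Collapse individual errors into one group per (code, method) pair,
// sorted by descending count.
function groupErrors(events: McpErrorEvent[]): ErrorGroup[] {
  const groups = new Map<string, ErrorGroup>();
  for (const e of events) {
    const key = `${e.code}:${e.method}`;
    const group = groups.get(key);
    if (group === undefined) {
      groups.set(key, {
        code: e.code,
        method: e.method,
        count: 1,
        lastOccurredMs: e.timestampMs,
        servers: new Set([e.server]),
      });
    } else {
      group.count += 1;
      group.lastOccurredMs = Math.max(group.lastOccurredMs, e.timestampMs);
      group.servers.add(e.server);
    }
  }
  return Array.from(groups.values()).sort((a, b) => b.count - a.count);
}

const exampleGroups = groupErrors([
  { server: "mcp-prod-east", method: "tools/call", code: -32602, timestampMs: 1000 },
  { server: "mcp-prod-east", method: "tools/call", code: -32602, timestampMs: 3000 },
  { server: "mcp-prod-west", method: "tools/call", code: -32602, timestampMs: 2000 },
  { server: "mcp-prod-west", method: "resources/read", code: -32603, timestampMs: 4000 },
]);
console.log(exampleGroups.length); // 2 groups from 4 events
```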

Slowest operations
Operation                   p50   p95   p99
tools/call: web_search      58    380   890
tools/call: code_exec       210   480   720
resources/read: /docs/api   76    280   540
prompts/get: search         22    210   320
tools/call: file_read       82    140   180
initialize                  18    42    65

Performance Metrics

Monitor request durations, identify bottlenecks, and track performance trends over time. p50, p95, p99 broken down by server and tool, so you can tell a single slow tool apart from a regressing deployment.
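
For readers who want to see what a per-operation breakdown involves, here is a minimal sketch that computes p50, p95, and p99 per operation using the nearest-rank method. The sample shape and function names are assumptions for illustration, and Spanly may well use a different percentile estimator.

```typescript
interface LatencySample {
  operation: string; // e.g. "tools/call: web_search"
  durationMs: number;
}

interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile over an ascending-sorted array.
// `p` is a percentage in the range (0, 100].
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) return NaN;
  const rank = Math.max(1, Math.ceil((p * sortedMs.length) / 100));
  return sortedMs[rank - 1] ?? NaN;
}

// Bucket samples by operation, then summarize each bucket.
function summarizeByOperation(samples: LatencySample[]): Map<string, LatencySummary> {
  const byOperation = new Map<string, number[]>();
  for (const s of samples) {
    const durations = byOperation.get(s.operation) ?? [];
    durations.push(s.durationMs);
    byOperation.set(s.operation, durations);
  }
  const summaries = new Map<string, LatencySummary>();
  byOperation.forEach((durations, operation) => {
    const sorted = durations.slice().sort((a, b) => a - b);
    summaries.set(operation, {
      p50: percentile(sorted, 50),
      p95: percentile(sorted, 95),
      p99: percentile(sorted, 99),
    });
  });
  return summaries;
}

// 100 synthetic samples with durations 1..100 ms for one tool.
const exampleSamples: LatencySample[] = Array.from({ length: 100 }, (_, i) => ({
  operation: "tools/call: web_search",
  durationMs: i + 1,
}));
const exampleSummary = summarizeByOperation(exampleSamples);
console.log(exampleSummary.get("tools/call: web_search")); // { p50: 50, p95: 95, p99: 99 }
```

Keeping the buckets separate per operation is what lets a single slow tool stand out instead of being averaged into an endpoint-wide number.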

Cursor        14.1k   +9%    Error rate 0.18%
Codex CLI     9.7k    +14%   Error rate 0.21%
Claude Code   18.2k   +28%   Error rate 0.09%
Windsurf      6.3k    +41%   Error rate 0.27%
Copilot       11.5k   3%     Error rate 0.14%
Cline         4.8k    +22%   Error rate 0.33%
Zed           3.2k    +52%   Error rate 0.41%
Continue      2.1k    +7%    Error rate 0.19%

Analytics

Aggregate usage across servers and clients to spot trends and tune performance. See which prompts, tools, and resources are actually getting used – and which ones aren't earning their keep.
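
One simple reading of "not earning their keep" is a tool that is registered but never called. The sketch below counts calls per tool and lists the unused ones. The function name and input shapes are hypothetical, chosen only to illustrate the idea.

```typescript
interface UsageReport {
  counts: Record<string, number>;
  unused: string[];
}

// Count calls per tool and list registered tools that were never called.
function usageReport(registeredTools: string[], calledTools: string[]): UsageReport {
  const counts: Record<string, number> = {};
  for (const tool of registeredTools) counts[tool] = 0;
  for (const tool of calledTools) counts[tool] = (counts[tool] ?? 0) + 1;
  const unused = registeredTools.filter((tool) => counts[tool] === 0);
  return { counts, unused };
}

const exampleReport = usageReport(
  ["web_search", "code_exec", "file_read"],
  ["web_search", "web_search", "code_exec"],
);
console.log(exampleReport.unused); // ["file_read"]
```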

DATA RESIDENCY

Your data, in your region

Pick US or EU when you create a project. Telemetry stays in the region you chose. GDPR-friendly by design, with full data residency.

2 regions
US and EU ingest endpoints. Pick one per project; data never leaves it.
1 jurisdiction
EU customers keep all telemetry inside the EU. GDPR-ready, no US-side replication.
0 cross-region copies
Daily backups live in the same region as your live data. Never replicated elsewhere.
Operate at scale

Built for the people on call

Live status for the team room, alerts before customers complain, session traces for fast root-cause, and an audit log of everything that happened.

Status board

Built for the office TV

Pin always-on status boards to any screen. Full-screen, auto-refresh, no chrome.

Auto-refresh · 30s · Live
Uptime: 99.97%
p95: 142ms
Errors: 18
Open incidents: 2
Alerts

Alerts, before users notice

Set thresholds on error rate, p95/p99, or traffic, scoped to any server, client, or tool. Routed to email, Slack, and signed webhooks.

Triggered just now
  • p99 latency exceeded threshold on mcp-prod-east
    p99 > 500ms for 5m · just now
  • Error rate spiked on tool code_exec
    errors > 2% for 10m · 6m ago
  • Traffic dropped 40% on client Cursor
    volume drop > 30% for 15m · 24m ago
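
A rule such as "p99 > 500ms for 5m" fires only when the breach has been continuous for the whole window, not on a single spike. The sketch below shows one way to evaluate that, assuming metric points arrive sorted by time. The types and the `isFiring` function are hypothetical and do not describe Spanly's alert engine.

```typescript
interface MetricPoint {
  timestampMs: number;
  value: number;
}

interface ThresholdRule {
  threshold: number; // fire when value > threshold ...
  forMs: number;     // ... continuously for at least this long
}

// Assumes `points` is sorted by ascending timestamp.
function isFiring(points: MetricPoint[], rule: ThresholdRule, nowMs: number): boolean {
  // Walk backwards from the newest point to find when the current breach began.
  let breachStartMs: number | undefined;
  for (let i = points.length - 1; i >= 0; i--) {
    const point = points[i];
    if (point === undefined || point.value <= rule.threshold) break;
    breachStartMs = point.timestampMs;
  }
  return breachStartMs !== undefined && nowMs - breachStartMs >= rule.forMs;
}

const MINUTE_MS = 60000;
const p99Rule: ThresholdRule = { threshold: 500, forMs: 5 * MINUTE_MS };

// One p99 reading per minute; the breach starts at minute 1.
const examplePoints: MetricPoint[] = [420, 510, 530, 560, 610, 540, 520].map(
  (value, i) => ({ timestampMs: i * MINUTE_MS, value }),
);

console.log(isFiring(examplePoints.slice(0, 6), p99Rule, 5 * MINUTE_MS)); // false: only 4 minutes in breach
console.log(isFiring(examplePoints, p99Rule, 6 * MINUTE_MS)); // true: 5 minutes in breach
```
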
Sessions

Trace every session, end to end

Group requests by session to see the full arc of an interaction: every tool call, every error, every retry.

Real-time
  • sess_72c1ed40 · started 12m ago · 12m 04s
    40 req · 2 err
  • sess_8a39f10c · started 3m ago · 3m 22s
    18 req · 0 err
  • sess_5d20b4ee · ended 2h ago · 41m 18s
    112 req · 5 err
  • sess_19b7cc28 · ended 5h ago · 1m 02s
    7 req · 0 err
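
The session rows above amount to a group-by on session ID. As a minimal sketch, assuming each captured request carries a session ID, a timestamp, and an error flag (all field names here are illustrative), the per-session request and error counts can be derived like this:

```typescript
interface RequestRecord {
  sessionId: string;
  method: string;
  timestampMs: number;
  isError: boolean;
}

interface SessionSummary {
  sessionId: string;
  requests: number;
  errors: number;
  startedMs: number;
  endedMs: number;
}

// Fold a flat stream of requests into one summary per session.
// Sessions are returned in order of first appearance.
function summarizeSessions(records: RequestRecord[]): SessionSummary[] {
  const bySession = new Map<string, SessionSummary>();
  for (const r of records) {
    const summary = bySession.get(r.sessionId);
    if (summary === undefined) {
      bySession.set(r.sessionId, {
        sessionId: r.sessionId,
        requests: 1,
        errors: r.isError ? 1 : 0,
        startedMs: r.timestampMs,
        endedMs: r.timestampMs,
      });
    } else {
      summary.requests += 1;
      if (r.isError) summary.errors += 1;
      summary.startedMs = Math.min(summary.startedMs, r.timestampMs);
      summary.endedMs = Math.max(summary.endedMs, r.timestampMs);
    }
  }
  return Array.from(bySession.values());
}

const exampleSessions = summarizeSessions([
  { sessionId: "sess_72c1ed40", method: "initialize", timestampMs: 0, isError: false },
  { sessionId: "sess_8a39f10c", method: "initialize", timestampMs: 500, isError: false },
  { sessionId: "sess_72c1ed40", method: "tools/call", timestampMs: 1200, isError: true },
  { sessionId: "sess_72c1ed40", method: "tools/call", timestampMs: 2400, isError: false },
]);
console.log(exampleSessions.map((s) => `${s.sessionId}: ${s.requests} req, ${s.errors} err`));
// ["sess_72c1ed40: 3 req, 1 err", "sess_8a39f10c: 1 req, 0 err"]
```
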
Audit log

A receipt for every action

Tamper-evident trail of every MCP call and config change.

  • 12:42:08 AUTH user@acme.com
    signed in via SSO
  • 12:41:55 WRITE user@acme.com
    changed alert threshold on mcp-prod-east
  • 12:39:21 ADMIN user@acme.com
    revoked API key k_8f3a…
  • 12:38:02 READ agent:claude-code
    accessed session sess_72c1ed40

Live in 5 minutes

Drop the SDK into your MCP server. No schema to maintain, no agent process, no rebuild.

For your server

Instrument your MCP server

Drop the SDK into your TypeScript or Python server, wrap any binary with the CLI, or ship Spanly as a Docker sidecar. Every prompt, tool call, and resource gets traced – no schema, no agent process, no rebuild.

Install
$ npm install @spanly/sdk
Use · spanly-setup.ts
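import { SpanlyClient } from "@spanly/sdk"

// Read the API key from the environment rather than hard-coding it.
const spanly = new SpanlyClient({
  apiKey: process.env.SPANLY_API_KEY,
})

// `mcpServer` is your existing MCP server instance, created elsewhere.
spanly.monitor(mcpServer)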
Also

Query Spanly from your editor

Spanly also ships an MCP server of its own. Point Claude Code, Cursor, or any MCP client at it to search traces and triage errors without leaving your IDE.

$ claude mcp add --transport http spanly https://mcp.spanly.com/mcp

Start monitoring your MCP servers

Engineering teams: add the MCP layer to your observability stack in under 5 minutes.