Analytics

Per-agent cost, latency, and volume metrics — sourced from the same numbers we bill with.

osmTalk surfaces analytics at three levels:

  1. Dashboard — fleet-wide call volume, status mix, peak hours
  2. Per-agent — cost breakdown, latency breakdown, daily volume for a single agent (/agents/[id])
  3. Per-call — token usage, latency-to-first-audio breakdown, cost components (/calls/[id])

Per-agent cost breakdown

On the agent detail page, the "Where the money went" card shows the cumulative spend split across:

  • LLM — token cost from the model provider (OpenAI / Anthropic / Groq)
  • STT — per-minute speech-to-text (Deepgram / Sarvam / ElevenLabs / Groq Whisper)
  • TTS — per-character text-to-speech (Deepgram Aura / ElevenLabs / Groq Orpheus)
  • SIP — per-minute Plivo telephony egress (phone calls only)

Numbers come directly from calls.costLlm / costStt / costTts / costSip / costTotal — the same columns the billing flow writes after each call. They will always match your invoice.

The Total Spend figure includes our platform margin (already applied). You don't pay it twice.
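To turn the four cost components into percentages for your own reporting, a minimal sketch follows. It assumes the `stats.cost` shape shown under Programmatic access; `costShares` is a hypothetical helper, not part of the osmTalk SDK. Shares are computed against the sum of the four components rather than `total`, because this sketch makes no assumption about how the platform margin is spread across components.

```typescript
// Assumed shape, taken from the stats.cost comment under "Programmatic access".
interface AgentCost {
  total: number;
  llm: number;
  stt: number;
  tts: number;
  sip: number;
}

type CostComponent = "llm" | "stt" | "tts" | "sip";

// Hypothetical helper: each component's share of the summed component cost,
// as a fraction between 0 and 1.
function costShares(cost: AgentCost): Record<CostComponent, number> {
  const components: CostComponent[] = ["llm", "stt", "tts", "sip"];
  const sum = components.reduce((acc, key) => acc + cost[key], 0);
  const shares: Record<CostComponent, number> = { llm: 0, stt: 0, tts: 0, sip: 0 };
  if (sum === 0) return shares; // no spend yet: avoid dividing by zero
  for (const key of components) {
    shares[key] = cost[key] / sum;
  }
  return shares;
}

// Example with made-up numbers.
console.log(costShares({ total: 100, llm: 50, stt: 20, tts: 20, sip: 10 }));
```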

Per-agent latency breakdown

The "Average Latency" grid shows four numbers averaged across the agent's calls:

Metric        What it measures
End-to-end    User finished speaking → first audio frame from the bot
STT TTFB      First audio chunk → first speech-to-text token
LLM TTFB      LLM prompt sent → first token in response
TTS TTFB      Final LLM response → first audio sample from TTS

The three TTFBs add up to roughly the end-to-end number (plus a few ms of internal routing).
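As a quick sanity check on that relationship, the sketch below subtracts the three TTFBs from the end-to-end average and reports what is left over. It assumes the `stats.latency` field names and the seconds unit shown under Programmatic access; `unattributedMs` is a hypothetical helper.

```typescript
// Assumed shape, taken from the stats.latency comment under "Programmatic access".
// All values are in seconds.
interface AgentLatency {
  avgTotal: number;
  avgSttTtfb: number;
  avgLlmTtfb: number;
  avgTtsTtfb: number;
}

// Hypothetical helper: milliseconds of end-to-end latency not explained
// by the three per-stage TTFBs.
function unattributedMs(latency: AgentLatency): number {
  const stagesSec = latency.avgSttTtfb + latency.avgLlmTtfb + latency.avgTtsTtfb;
  return Math.round((latency.avgTotal - stagesSec) * 1000);
}

// Example with made-up numbers.
console.log(unattributedMs({ avgTotal: 1.2, avgSttTtfb: 0.2, avgLlmTtfb: 0.6, avgTtsTtfb: 0.35 }));
```

A residual much larger than a few milliseconds would suggest that time is being spent outside the three measured stages.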

If end-to-end is high (>1500 ms):

  • STT dominant → switch to a faster STT model (Groq Whisper Turbo: $0.04/hr, ~80 ms)
  • LLM dominant → switch from gpt-5.4 to gpt-5.4-mini (~3x faster) or Groq Llama
  • TTS dominant → switch to Deepgram Aura ($0.030 per 1K characters) or ElevenLabs Flash (~150 ms first byte)
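The triage above can be sketched in code. This assumes the `stats.latency` fields are in seconds, as noted under Programmatic access, and reuses the 1500 ms threshold from this section; `dominantStage` is a hypothetical helper.

```typescript
// Assumed shape, taken from the stats.latency comment under "Programmatic access".
// All values are in seconds.
interface AgentLatency {
  avgTotal: number;
  avgSttTtfb: number;
  avgLlmTtfb: number;
  avgTtsTtfb: number;
}

type Stage = "stt" | "llm" | "tts";

// Hypothetical helper: returns the stage with the largest TTFB when the
// end-to-end average exceeds the threshold, or null when latency is acceptable.
function dominantStage(latency: AgentLatency, thresholdMs: number = 1500): Stage | null {
  if (latency.avgTotal * 1000 <= thresholdMs) return null;
  let stage: Stage = "stt";
  let worst = latency.avgSttTtfb;
  if (latency.avgLlmTtfb > worst) {
    stage = "llm";
    worst = latency.avgLlmTtfb;
  }
  if (latency.avgTtsTtfb > worst) {
    stage = "tts";
  }
  return stage;
}

// Example with made-up numbers: the LLM accounts for most of the delay.
console.log(dominantStage({ avgTotal: 2.0, avgSttTtfb: 0.3, avgLlmTtfb: 1.4, avgTtsTtfb: 0.3 }));
```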

Per-call latency breakdown

The "Time-to-first-audio breakdown" stacked bar on the call detail page shows the same three TTFB segments for that specific call. Useful when a single call felt sluggish — you can pinpoint which service was the bottleneck on that turn without diving into raw event logs.

Cost trend

agentStats.dailyVolume[] includes a cost field per day, so the volume chart can render a cost-over-time overlay. Use this to spot:

  • A talkative customer who drove up output tokens for that agent
  • A change in system prompt that 3x'd LLM costs
  • A switch from Deepgram TTS to ElevenLabs that doubled per-call cost
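One way to spot such jumps automatically is to compare cost per call from one day to the next. The sketch below assumes `dailyVolume` entries carry `date`, `calls`, and `cost` as listed under Programmatic access, and that the array is sorted by date in ascending order (not confirmed by this page). `costSpikes` and its default factor of 2 are hypothetical.

```typescript
// Subset of the assumed dailyVolume entry shape from "Programmatic access".
interface DailyVolume {
  date: string;
  calls: number;
  cost: number;
}

// Hypothetical helper: returns the dates on which cost per call was at least
// `factor` times the previous active day's cost per call.
// Assumes `days` is sorted by date, oldest first.
function costSpikes(days: DailyVolume[], factor: number = 2): string[] {
  const spikes: string[] = [];
  let previous: number | null = null;
  for (const day of days) {
    if (day.calls === 0) continue; // skip days with no calls
    const perCall = day.cost / day.calls;
    if (previous !== null && previous > 0 && perCall >= previous * factor) {
      spikes.push(day.date);
    }
    previous = perCall;
  }
  return spikes;
}

// Example with made-up numbers.
console.log(
  costSpikes([
    { date: "2026-01-01", calls: 10, cost: 20 },
    { date: "2026-01-02", calls: 10, cost: 22 },
    { date: "2026-01-03", calls: 10, cost: 60 },
  ]),
);
```

Normalizing by call count keeps a busy day from being flagged simply because it had more calls.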

Programmatic access

TypeScript:

const stats = await api.getAgentStats(agentId);
// stats.cost = { total, llm, stt, tts, sip, avgPerCall, currency: "INR" }
// stats.tokens = { prompt, completion }
// stats.latency = { avgTotal, avgLlmTtfb, avgTtsTtfb, avgSttTtfb }  (seconds)
// stats.dailyVolume = [ { date, calls, completed, failed, avgDuration, cost } ]

Python:

stats = osm._request("GET", f"/api/agents/{agent_id}/stats")
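If you consume this response from TypeScript, it can help to type it. The interface below is transcribed from the comments in this section and is an assumption, not a published schema. `summarizeStats` is a hypothetical helper that shows the seconds-to-milliseconds conversion.

```typescript
// Assumed response shape, transcribed from the comments in this section.
interface AgentStats {
  cost: {
    total: number;
    llm: number;
    stt: number;
    tts: number;
    sip: number;
    avgPerCall: number;
    currency: string;
  };
  tokens: { prompt: number; completion: number };
  // Latency values are in seconds.
  latency: { avgTotal: number; avgLlmTtfb: number; avgTtsTtfb: number; avgSttTtfb: number };
  dailyVolume: {
    date: string;
    calls: number;
    completed: number;
    failed: number;
    avgDuration: number;
    cost: number;
  }[];
}

// Hypothetical helper: a one-line summary for logs or a status page.
function summarizeStats(stats: AgentStats): string {
  const perCall = stats.cost.avgPerCall.toFixed(2);
  const latencyMs = Math.round(stats.latency.avgTotal * 1000); // seconds to ms
  return `${stats.cost.currency} ${perCall} per call, ${latencyMs} ms to first audio`;
}
```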

Why this matters

Voice-AI pricing in 2026 ranges from $0.05 to $0.40 per minute depending on the model stack (per Vapi pricing teardowns). Without a per-agent / per-call breakdown you can't tell whether a specific agent's prompt is bleeding tokens or whether a specific call hit a slow provider. These two panels turn cost and latency into things you can act on.