Analytics
Per-agent cost, latency, and volume metrics — sourced from the same numbers we bill with.
osmTalk surfaces analytics at three levels:
- Dashboard: fleet-wide call volume, status mix, peak hours
- Per-agent: cost breakdown, latency breakdown, and daily volume for a single agent (/agents/[id])
- Per-call: token usage, latency-to-first-audio breakdown, and cost components (/calls/[id])
Per-agent cost breakdown
On the agent detail page, the "Where the money went" card shows the cumulative spend split across:
- LLM — token cost from the model provider (OpenAI / Anthropic / Groq)
- STT — per-minute speech-to-text (Deepgram / Sarvam / ElevenLabs / Groq Whisper)
- TTS — per-character text-to-speech (Deepgram Aura / ElevenLabs / Groq Orpheus)
- SIP — per-minute Plivo telephony egress (phone calls only)
Numbers come directly from calls.costLlm / costStt / costTts / costSip / costTotal — the same columns the billing flow writes after each call. They will always match your invoice.
The Total Spend figure includes our platform margin (already applied). You don't pay it twice.
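As a rough illustration of how the split could be turned into percentages, the sketch below takes a dict shaped like the `stats.cost` object listed under Programmatic access and computes each component's share. The helper name `cost_shares` and the sample numbers are hypothetical. Shares are taken against the sum of the four components, not against `total`, because this page does not say whether the platform margin is folded into each component or only into the total.

```python
from typing import Dict

# Component keys as they appear in stats.cost on this page.
COMPONENTS = ("llm", "stt", "tts", "sip")


def cost_shares(cost: Dict[str, float]) -> Dict[str, float]:
    """Return each component's share of the component subtotal, in percent.

    Hypothetical helper: it only assumes the keys shown in stats.cost.
    """
    subtotal = sum(cost.get(name, 0.0) for name in COMPONENTS)
    if subtotal <= 0:
        # No spend recorded yet, so every share is zero.
        return {name: 0.0 for name in COMPONENTS}
    return {
        name: round(100.0 * cost.get(name, 0.0) / subtotal, 1)
        for name in COMPONENTS
    }


# Illustrative numbers only, not real billing data.
example_cost = {
    "total": 100.0, "llm": 50.0, "stt": 20.0, "tts": 25.0, "sip": 5.0,
    "avgPerCall": 2.0, "currency": "INR",
}
shares = cost_shares(example_cost)
print(shares)
```

For web-only agents the `sip` share would simply be zero, since SIP cost applies to phone calls only.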
Per-agent latency breakdown
The "Average Latency" grid shows four numbers averaged across the agent's calls:
| Metric | What it measures |
|---|---|
| End-to-end | User finished speaking → first audio frame from the bot |
| STT TTFB | First audio chunk → first speech-to-text token |
| LLM TTFB | LLM prompt sent → first token in response |
| TTS TTFB | Final LLM response → first audio sample from TTS |
The three TTFBs add up to roughly the end-to-end number (plus a few ms of internal routing).
If end-to-end is high (>1500 ms):
- STT dominant → switch to a faster STT model (Groq Whisper Turbo: $0.04/hr, ~80 ms)
- LLM dominant → switch from gpt-5.4 to gpt-5.4-mini (~3x faster) or Groq Llama
- TTS dominant → switch to Deepgram Aura ($0.030 per 1K characters) or ElevenLabs Flash (~150 ms first byte)
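The triage above could be automated roughly as follows. This is a sketch, not an osmTalk feature: `dominant_stage` and `routing_overhead_ms` are hypothetical helpers, and the input is assumed to be shaped like `stats.latency` from Programmatic access. That object reports seconds, while the 1500 ms threshold here is in milliseconds, so the code converts before comparing.

```python
from typing import Dict, Optional

HIGH_LATENCY_MS = 1500  # the "high" end-to-end threshold from the text above

# Map a readable stage name to its key in stats.latency.
STAGES = {"STT": "avgSttTtfb", "LLM": "avgLlmTtfb", "TTS": "avgTtsTtfb"}


def dominant_stage(latency: Dict[str, float]) -> Optional[str]:
    """Return the slowest stage if end-to-end latency is high, else None."""
    total_ms = latency["avgTotal"] * 1000  # seconds -> milliseconds
    if total_ms <= HIGH_LATENCY_MS:
        return None
    return max(STAGES, key=lambda stage: latency[STAGES[stage]])


def routing_overhead_ms(latency: Dict[str, float]) -> int:
    """End-to-end time not explained by the three TTFBs, in whole ms."""
    ttfb_sum = sum(latency[key] for key in STAGES.values())
    return round((latency["avgTotal"] - ttfb_sum) * 1000)


# Illustrative values in seconds, not measurements.
example_latency = {
    "avgTotal": 1.9, "avgSttTtfb": 0.3, "avgLlmTtfb": 1.2, "avgTtsTtfb": 0.35,
}
print(dominant_stage(example_latency))
print(routing_overhead_ms(example_latency))
```

A large overhead value would suggest the delay sits outside the three providers, in which case swapping models is unlikely to help.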
Per-call latency breakdown
The "Time-to-first-audio breakdown" stacked bar on the call detail page shows the same three TTFB segments for that specific call. Useful when a single call felt sluggish — you can pinpoint which service was the bottleneck on that turn without diving into raw event logs.
Cost trend
agentStats.dailyVolume[] includes a cost field per day, so the volume chart can render a cost-over-time overlay. Use this to spot:
- A talkative customer who drove up output tokens for that agent
- A system prompt change that tripled LLM costs
- A switch from Deepgram TTS to ElevenLabs that doubled per-call cost
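One way to spot such jumps without eyeballing the chart is to compare each day's cost per call against the median day. The sketch below assumes entries shaped like `stats.dailyVolume` (`date`, `calls`, `cost`). The helper `cost_spike_days`, the 2x factor, the date format, and the sample data are all illustrative assumptions.

```python
from statistics import median
from typing import Dict, List


def cost_spike_days(daily_volume: List[Dict], factor: float = 2.0) -> List[str]:
    """Return dates whose cost per call exceeds `factor` times the median day."""
    # Skip days with no calls to avoid dividing by zero.
    per_call = {
        day["date"]: day["cost"] / day["calls"]
        for day in daily_volume
        if day["calls"] > 0
    }
    if not per_call:
        return []
    baseline = median(per_call.values())
    return [date for date, value in per_call.items() if value > factor * baseline]


# Illustrative data only; the date strings are an assumed format.
example_days = [
    {"date": "2026-01-01", "calls": 10, "cost": 20.0},
    {"date": "2026-01-02", "calls": 12, "cost": 24.0},
    {"date": "2026-01-03", "calls": 8, "cost": 48.0},
    {"date": "2026-01-04", "calls": 10, "cost": 22.0},
]
spikes = cost_spike_days(example_days)
print(spikes)
```

Using cost per call instead of raw daily cost keeps a busy day from being flagged just because it had more calls.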
Programmatic access
const stats = await api.getAgentStats(agentId);
// stats.cost = { total, llm, stt, tts, sip, avgPerCall, currency: "INR" }
// stats.tokens = { prompt, completion }
// stats.latency = { avgTotal, avgLlmTtfb, avgTtsTtfb, avgSttTtfb } (seconds)
// stats.dailyVolume = [ { date, calls, completed, failed, avgDuration, cost } ]

stats = osm._request("GET", f"/api/agents/{agent_id}/stats")

Why this matters
Voice-AI pricing in 2026 ranges from $0.05 to $0.40 per minute depending on the model stack (per Vapi pricing teardowns). Without a per-agent / per-call breakdown you can't tell whether a specific agent's prompt is bleeding tokens or whether a specific call hit a slow provider. These two panels turn cost and latency into things you can act on.