Changelog

v0.7.0 — Honest Call Outcomes (May 2026)

Bulk-call accuracy fixes

Calls with no audio are now marked status: "failed" instead of being falsely reported as completed. The bot now tracks whether real audio flowed in either direction, and the API routes silent calls through a new /api/calls/:id/fail endpoint that skips billing.
The 5-minute idle timeout now reports endReason: "idle_timeout" instead of silently flipping to completed. The same applies to the provider error circuit breaker (endReason: "provider_circuit_open") and bot-startup timeouts (endReason: "bot_startup_failed").
New calls.failureReason column + machine-readable enum of 10 reasons (no_audio_output, no_audio_either_direction, idle_timeout, provider_circuit_open, sip_no_answer, sip_rejected, bot_startup_failed, caller_hung_up_silently, stale_sweep, unknown). Full per-reason guide at Failure Reasons.
Campaign workers automatically retry retryable failures: failed calls with a retryable reason are now retried up to your retryPolicy.maxAttempts instead of being treated as terminal "no answer" outcomes.
Refund path — calls marked failed via this system have billingStatus: "free", so they don't burn credits.
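
As a rough illustration of how the outcome rules above fit together, the TypeScript sketch below classifies a finished call from two audio-flow flags and decides whether a campaign worker would retry it. The ten reason strings come from this release. The CallSummary and Outcome shapes, the classifyOutcome and shouldRetry names, and especially the choice of which reasons count as retryable are assumptions made for the example, not the platform's actual rules.

```typescript
// The ten machine-readable reasons listed in this release.
type FailureReason =
  | "no_audio_output"
  | "no_audio_either_direction"
  | "idle_timeout"
  | "provider_circuit_open"
  | "sip_no_answer"
  | "sip_rejected"
  | "bot_startup_failed"
  | "caller_hung_up_silently"
  | "stale_sweep"
  | "unknown";

// Hypothetical summary of what the bot observed during the call.
interface CallSummary {
  botAudioFlowed: boolean;
  userSpeechFlowed: boolean;
}

// Hypothetical result shape; only the "free" billing value appears in the changelog.
interface Outcome {
  status: "completed" | "failed";
  failureReason?: FailureReason;
  billingStatus?: "free";
}

function classifyOutcome(call: CallSummary): Outcome {
  if (!call.botAudioFlowed && !call.userSpeechFlowed) {
    return { status: "failed", failureReason: "no_audio_either_direction", billingStatus: "free" };
  }
  if (!call.botAudioFlowed) {
    return { status: "failed", failureReason: "no_audio_output", billingStatus: "free" };
  }
  // Bot audio flowed; this sketch treats the call as completed.
  return { status: "completed" };
}

// ASSUMPTION: an illustrative retryable subset, not the platform's real list.
const RETRYABLE: ReadonlySet<FailureReason> = new Set<FailureReason>([
  "no_audio_output",
  "no_audio_either_direction",
  "provider_circuit_open",
  "bot_startup_failed",
]);

function shouldRetry(outcome: Outcome, attemptsSoFar: number, maxAttempts: number): boolean {
  if (outcome.status !== "failed" || outcome.failureReason === undefined) return false;
  return RETRYABLE.has(outcome.failureReason) && attemptsSoFar < maxAttempts;
}
```

With maxAttempts set to 3, a silent first attempt would be retried, while a third silent attempt would be left as a terminal failure.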

Dashboard

Failure-reason banner on every failed call's detail page — title, plain-English cause, "who's to blame" pill (platform / caller / environment / carrier), and "what to try". Same copy as the SDK's describeFailureReason() and the docs failure-reasons page.

Bot

Storage health probe at bot startup — recordings silently failing because MinIO is unreachable now surface as a loud startup-time error and a 503 on /health/deep. Previously this was only visible via per-call "Recording URL save FAILED" warnings, which are easy to miss.
Audio-flow tracking in MetricsCollector — observes BotStartedSpeakingFrame and TranscriptionFrame to set bot_audio_flowed / user_speech_flowed and end_reason properties that the cleanup path posts to /complete.
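
The audio-flow tracking described above can be pictured with the following TypeScript sketch. It is not the bot's MetricsCollector: the Frame union is a hypothetical stand-in for the two frame types named in this entry, the AudioFlowTracker class is invented for illustration, and ignoring blank transcriptions is an assumption. Only the property names bot_audio_flowed, user_speech_flowed, and end_reason are taken from the changelog.

```typescript
// Hypothetical stand-ins for the frame types named in this entry.
type Frame =
  | { kind: "BotStartedSpeakingFrame" }
  | { kind: "TranscriptionFrame"; text: string }
  | { kind: "Other" };

interface CompletionPayload {
  bot_audio_flowed: boolean;
  user_speech_flowed: boolean;
  end_reason: string;
}

class AudioFlowTracker {
  private botAudioFlowed = false;
  private userSpeechFlowed = false;

  // Flip a flag the first time evidence of real audio is seen.
  observe(frame: Frame): void {
    if (frame.kind === "BotStartedSpeakingFrame") {
      this.botAudioFlowed = true;
    } else if (frame.kind === "TranscriptionFrame" && frame.text.trim().length > 0) {
      // ASSUMPTION: blank transcriptions do not count as user speech.
      this.userSpeechFlowed = true;
    }
  }

  // Build the body that a cleanup path could post when the call ends.
  toPayload(endReason: string): CompletionPayload {
    return {
      bot_audio_flowed: this.botAudioFlowed,
      user_speech_flowed: this.userSpeechFlowed,
      end_reason: endReason,
    };
  }
}
```

A call that ends with both flags still false is the case the API can route to the fail endpoint instead of billing it.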

SDK

@osmapi/osmtalk-sdk@0.5.0 ships CallRecord.failureReason plus three helpers: describeFailureReason(), isRetryableFailure(), callConnected(). New filters on calls.list({ campaignId, failureReason }).

A bulk-call test produced 85 calls, of which 81% were incorrectly marked completed despite zero audio flowing: phantom calls caused by a TTS connect-storm under concurrency. The platform was charging for silence, and the campaign engine wasn't retrying. This release makes the call status honest, the retries automatic, and the dashboard self-explaining when something fails.

v0.6.0 — Voice Agent Platform Hardening (April 2026)

Voice Pipeline Upgrade to v1.0

osmTalk Voice Pipeline v1.0 stable — upgraded from v0.0.102
Sarvam SDK 0.1.26 — fixes outdated-SDK integration bugs
WebSocket retry storm fix — bot no longer floods providers with requests after an auth/credit failure (caps at 3 rapid failures → non-fatal ErrorFrame instead of infinite reconnect)
MCPClient lifecycle — now uses the required await mcp.start() before register_tools(llm)
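
The retry-storm cap can be sketched as a small failure counter. This is a minimal TypeScript illustration rather than the bot's code: the RapidFailureBreaker class is hypothetical, and the 10-second window used to decide what counts as "rapid" is an assumption, since the changelog only states the limit of 3.

```typescript
class RapidFailureBreaker {
  private failureTimes: number[] = [];
  private readonly maxRapidFailures: number;
  private readonly windowMs: number;

  // ASSUMPTION: the 10 s default window is illustrative; only the limit of 3 is documented.
  constructor(maxRapidFailures: number = 3, windowMs: number = 10_000) {
    this.maxRapidFailures = maxRapidFailures;
    this.windowMs = windowMs;
  }

  // Record a failed connect attempt. Returns true when the caller should
  // stop reconnecting and surface a non-fatal error instead.
  recordFailure(nowMs: number): boolean {
    this.failureTimes = this.failureTimes.filter((t) => nowMs - t < this.windowMs);
    this.failureTimes.push(nowMs);
    return this.failureTimes.length >= this.maxRapidFailures;
  }

  // A successful connection clears the failure history.
  recordSuccess(): void {
    this.failureTimes = [];
  }
}
```

Failures spaced further apart than the window never trip the breaker, so ordinary transient drops still reconnect.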

Tool Calling Improvements

Smart cancel_on_interruption defaults — HTTP POST/PUT/DELETE tools default to False (won't half-finish transactions); GET tools default to True
Per-tool cancelOnInterruption override in tool config
Per-tool timeoutSecs — set tight timeouts on fast APIs, loose on slow ones
Group parallel tools — LLM tool calls now run in parallel (default ON)
Streaming intermediate results — handlers can emit progress via result_callback(msg, is_final=False)
Function-args interrupt crash fix — no more JSONDecodeError when user interrupts mid-tool
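
The default-plus-override rule for cancel_on_interruption can be expressed in a few lines. The sketch below is illustrative TypeScript: cancelOnInterruption and timeoutSecs are the setting names from this release, while the ToolConfig shape, its method field, and the resolveCancelOnInterruption helper are assumptions.

```typescript
type HttpMethod = "GET" | "POST" | "PUT" | "DELETE";

// Hypothetical tool config shape; only the two optional setting names come from the changelog.
interface ToolConfig {
  method: HttpMethod;
  cancelOnInterruption?: boolean; // per-tool override
  timeoutSecs?: number; // per-tool timeout
}

function resolveCancelOnInterruption(tool: ToolConfig): boolean {
  // An explicit per-tool setting always wins.
  if (tool.cancelOnInterruption !== undefined) return tool.cancelOnInterruption;
  // Otherwise only read-only GET tools are cancelled when the user interrupts;
  // POST/PUT/DELETE run to completion so a transaction is not left half-finished.
  return tool.method === "GET";
}
```

For example, a POST tool that is safe to abandon could set cancelOnInterruption: true to opt back in to cancellation.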

Voice Experience

VAD audio_idle_timeout — fixes VAD getting stuck in SPEAKING state (default 1.5s, override via audioIdleTimeoutSecs setting)
ElevenLabs 48kHz native audio — crisper voice, no upsampling artifacts (override via ttsSampleRate setting)
Warm handoff summary — transfer_to_human now sends the last 10 conversation turns to the human agent's API

New LLM Provider: OpenAI Responses API

Opt-in via openai-responses provider on agent
WebSocket-based incremental context for lower latency on long calls
Falls back to Chat Completions for HTTP chat (WhatsApp/Web Chat), since the openai-responses provider is voice-only

DTMF Keypad Support

Phone callers: DTMF digits from the PSTN are captured and fed to the LLM context as user input (e.g., in response to a prompt such as "Press 1 for English, 2 for Hindi")
Web widget keypad UI — users can tap a 3×4 keypad that sends digits over the data channel
Per-agent toggle via enableDtmf setting (auto-enabled for phone agents)
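
One way to picture "fed to the LLM context as user input" is the TypeScript sketch below. The ContextMessage shape, the appendDtmf helper, and the "[DTMF]" prefix are all hypothetical; the changelog does not say how digits are formatted in the context. The accepted characters are the twelve keys of a standard 3×4 keypad.

```typescript
interface ContextMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// The twelve keys of a 3x4 telephone keypad.
const KEYPAD_KEYS = new Set("123456789*0#".split(""));

// Returns a new context with the pressed digits appended as a user message.
// Characters that are not keypad keys are dropped.
function appendDtmf(context: ContextMessage[], digits: string): ContextMessage[] {
  const valid = digits
    .split("")
    .filter((d) => KEYPAD_KEYS.has(d))
    .join("");
  if (valid.length === 0) return context;
  // ASSUMPTION: the "[DTMF]" prefix is an illustrative convention only.
  return [...context, { role: "user", content: `[DTMF] ${valid}` }];
}
```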

Tool Status UI

Live tool-execution cards in the voice widget — users see "Looking that up…" spinner while HTTP tools run
Auto-dismiss after completion (3s fade)
Error cards show a red X and stay visible until the call ends

Observability

Sentry integration (both bot + API) — set SENTRY_DSN env var to enable
/api/health/deep endpoint — probes Database, Redis, Bot, voice transport, and MinIO
Bot /health/deep endpoint — reports active call count, Sentry status, version
Latency metrics dashboard (already existed) — per-call TTFB breakdown across STT/LLM/TTS

Graceful Shutdown

Bot SIGTERM handler — active calls hear "I need to end this call for a system update, please call back" before disconnecting
20-second grace period (configurable via SHUTDOWN_GRACE_SECS) for in-flight TTS to finish
Prevents dropped calls during deploys
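
The grace-period behaviour can be sketched as a bounded wait. This is an illustrative TypeScript outline, not the bot's SIGTERM handler: parseGraceSecs and waitWithGrace are hypothetical helpers, and treating missing, non-numeric, or negative values as "use the 20-second default" is an assumption.

```typescript
// Parse a SHUTDOWN_GRACE_SECS-style value, falling back to the 20 s default.
function parseGraceSecs(raw: string | undefined, fallback: number = 20): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}

// Wait for in-flight work (for example, the goodbye message finishing),
// but never longer than the grace period.
function waitWithGrace(work: Promise<void>, graceMs: number): Promise<"drained" | "timed_out"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timed_out">((resolve) => {
    timer = setTimeout(() => resolve("timed_out"), graceMs);
  });
  const drained = work.then(() => "drained" as const);
  return Promise.race([drained, timeout]).finally(() => {
    // Clear the timer so it does not keep the process alive after draining.
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

A shutdown handler would call waitWithGrace with the grace period converted to milliseconds, then disconnect regardless of which result comes back.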

System Prompt Fixes

Web chat now sends dynamic context (agent name, date/time, format rules, language rules); previously it sent only the raw prompt
Same behavior across all 4 channels (web call, phone call, web chat, WhatsApp chat)

Security & Cleanup

Removed deprecated models from catalog: llama-4-maverick, anthropic/claude-sonnet-4-5-20250929 (on OpenRouter), saaras:v2.5, scribe_v2_realtime
Validated all remaining STT/TTS/LLM models against live provider APIs (35 total verified)

v0.5.0 — Shared Credits & Billing (April 2026)

Billing & Payments

Usage-based credits — Pay per call, no subscriptions
Shared credits with osmAPI — Single balance across both platforms
Razorpay integration — Top up via card, UPI, or netbanking
Billing dashboard — Balance, transactions, usage with INR/USD toggle
Pre-call balance check — Calls rejected if credits below ₹1
Cost breakdown per call — See LLM, STT, TTS, and SIP costs individually
Chat billing — LLM token costs deducted for chat messages
Inbound call billing — SIP charges for incoming phone calls
WhatsApp call billing — SIP charges for WhatsApp voice calls
Minimum charge — ₹1 minimum per call
Phone number setup fee — ₹50 one-time on purchase
Monthly phone rental — Automatic billing via background job
Low balance alerts — Email when credits drop below ₹10
Phone deactivation — Numbers auto-deactivated when rental unpaid
Data retention — Free 30-day storage, cleanup after expiry
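
The pre-call balance check and the per-call minimum can be sketched as follows. To stay dependency-free, this TypeScript example uses integer paise instead of the Decimal.js arithmetic the platform reports using, so it is a substitute technique rather than the real billing code. The CostBreakdownPaise shape and both function names are hypothetical; real per-call costs may also need a finer unit than whole paise.

```typescript
const PAISE_PER_RUPEE = 100;
const MIN_BALANCE_PAISE = 1 * PAISE_PER_RUPEE; // calls are rejected below ₹1
const MIN_CALL_CHARGE_PAISE = 1 * PAISE_PER_RUPEE; // ₹1 minimum per call

// Hypothetical per-call cost breakdown, in whole paise.
interface CostBreakdownPaise {
  llm: number;
  stt: number;
  tts: number;
  sip: number;
}

// Pre-call check: is there enough credit to start a call at all?
function canStartCall(balancePaise: number): boolean {
  return balancePaise >= MIN_BALANCE_PAISE;
}

// Amount to deduct: the sum of the four components, raised to the minimum if needed.
function callChargePaise(cost: CostBreakdownPaise): number {
  const total = cost.llm + cost.stt + cost.tts + cost.sip;
  return Math.max(total, MIN_CALL_CHARGE_PAISE);
}
```

Working in integer minor units avoids binary floating-point rounding for sums like these, which is the same class of error a decimal library is meant to prevent.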

Auth

Shared auth with osmAPI — Single login across both platforms
Login/register redirects to osmAPI (email, Google, GitHub)
Cross-domain token-based authentication
Redis session cache (5-minute TTL)
Auth guard on all dashboard pages
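
The 5-minute session cache behaves like any TTL cache: a hit within the window skips the cross-domain token check, and an expired entry forces a fresh lookup. The TypeScript sketch below uses an in-memory Map as a stand-in for Redis; the TtlCache class and the explicit nowMs parameter (used to keep the example deterministic) are assumptions.

```typescript
// In-memory stand-in for a Redis key with an expiry.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  set(key: string, value: V, nowMs: number): void {
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs });
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string, nowMs: number): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (nowMs >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

const SESSION_TTL_MS = 5 * 60 * 1000; // 5-minute TTL from this release
```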

Enterprise Features

Decimal.js precision — No floating-point billing errors
Atomic transactions — All-or-nothing credit deduction
Rate limiting — Per IP, user, and API key
Structured logging — Pino JSON logs for production
Background jobs — Phone billing + data retention cron endpoints
Idempotent webhooks — Duplicate payment protection
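
Duplicate payment protection usually comes down to remembering which payment events have already been applied. The TypeScript sketch below shows that idea with an in-memory set; the PaymentLedger class and its method names are hypothetical, and a production system would typically enforce the same rule with a unique key inside the database transaction that credits the balance.

```typescript
class PaymentLedger {
  private readonly processed = new Set<string>();
  private balancePaise = 0;

  // Apply a payment webhook at most once.
  // Returns true if credits were added, false if the event was a duplicate.
  applyPayment(paymentId: string, amountPaise: number): boolean {
    if (this.processed.has(paymentId)) return false;
    this.processed.add(paymentId);
    this.balancePaise += amountPaise;
    return true;
  }

  getBalancePaise(): number {
    return this.balancePaise;
  }
}
```

If the payment provider redelivers the same event, the second call returns false and the balance is unchanged.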

Infrastructure

Dual database (OsmTalk + StartFlow shared DB on Neon)
Docker PostgreSQL removed (migrated to Neon cloud)
Multi-organization support with org switcher
Projects table for grouping agents

Docs

Billing documentation (overview, credits, pricing, top-up)
API reference for all endpoints
Phone number guides updated
Changelog added

v0.4.0 — MCP Server & Outbound Calls (March 2026)

New Features

MCP Server — Make phone calls from Claude Desktop (npm)
openingMessage — Agent speaks a pre-written message instantly when call connects (no LLM delay)
callerName — Agent identifies who it's calling on behalf of
Call history via MCP — View calls, transcripts, and analytics from Claude
Dashboard analytics — Total calls, success rate, top agents