Changelog
Version history and release notes for osmTalk.
v0.7.0 — Honest Call Outcomes (May 2026)
Bulk-call accuracy fixes
- Calls with no audio are now marked status: "failed", not falsely completed. The bot now tracks whether real audio flowed in either direction and the API routes silent calls through a new /api/calls/:id/fail endpoint that skips billing.
- 5-minute idle timeout now reports endReason: "idle_timeout" instead of silently flipping to completed. Same for the provider error-circuit breaker (endReason: "provider_circuit_open") and bot-startup timeouts (endReason: "bot_startup_failed").
- New calls.failureReason column + machine-readable enum of 10 reasons (no_audio_output, no_audio_either_direction, idle_timeout, provider_circuit_open, sip_no_answer, sip_rejected, bot_startup_failed, caller_hung_up_silently, stale_sweep, unknown). Full per-reason guide at Failure Reasons.
- Campaign workers automatically retry retryable failures — failed calls with retryable reasons now hit your retryPolicy.maxAttempts instead of being treated as terminal "no answer" outcomes.
- Refund path — calls marked failed via this system have billingStatus: "free", so they don't burn credits.
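As a rough illustration of the retry behaviour described above, the sketch below models the ten failure reasons as a TypeScript union and decides whether a failed call should be retried. The changelog does not say which reasons are retryable, so the `RETRYABLE_SKETCH` set, the `CallAttempt` shape, and `shouldRetry()` are hypothetical names and assumptions, not the platform's actual logic; the SDK's own `isRetryableFailure()` is the authoritative source.

```typescript
// The ten failure reasons listed in the v0.7.0 notes.
type FailureReason =
  | "no_audio_output"
  | "no_audio_either_direction"
  | "idle_timeout"
  | "provider_circuit_open"
  | "sip_no_answer"
  | "sip_rejected"
  | "bot_startup_failed"
  | "caller_hung_up_silently"
  | "stale_sweep"
  | "unknown";

// ASSUMPTION: this subset is illustrative only. The changelog does not
// state which reasons the platform treats as retryable.
const RETRYABLE_SKETCH: ReadonlySet<FailureReason> = new Set<FailureReason>([
  "no_audio_output",
  "no_audio_either_direction",
  "provider_circuit_open",
  "bot_startup_failed",
  "sip_no_answer",
]);

// Hypothetical shape for one attempt of a campaign call.
interface CallAttempt {
  status: "completed" | "failed";
  failureReason?: FailureReason;
  attempt: number; // 1-based attempt counter
}

// Retry only failed calls whose reason is in the retryable set and
// that have not yet used up maxAttempts.
function shouldRetry(call: CallAttempt, maxAttempts: number): boolean {
  if (call.status !== "failed" || call.failureReason === undefined) {
    return false;
  }
  return RETRYABLE_SKETCH.has(call.failureReason) && call.attempt < maxAttempts;
}
```

With `maxAttempts` set to 3, a first attempt that failed with `no_audio_output` would be retried under these assumptions, while a third attempt or a `sip_rejected` failure would not.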
Dashboard
- Failure-reason banner on every failed call's detail page — title, plain-English cause, "who's to blame" pill (platform / caller / environment / carrier), and "what to try". Same copy as the SDK's describeFailureReason() and the docs failure-reasons page.
Bot
- Storage health probe at bot startup — recordings silently failing because MinIO is unreachable now surface as a loud startup-time error and a 503 on /health/deep. Previously this was only visible via per-call "Recording URL save FAILED" warnings, which are easy to miss.
- Audio-flow tracking in MetricsCollector — observes BotStartedSpeakingFrame and TranscriptionFrame to set bot_audio_flowed / user_speech_flowed and end_reason properties that the cleanup path posts to /complete.
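The audio-flow flags above suggest how a receiver of the /complete payload could turn them into an honest status. The sketch below is one possible mapping, not the actual server code: the `AudioFlowSnapshot` and `Outcome` types and the `classifyOutcome()` function are hypothetical, and the rule that "bot silent but user heard" maps to `no_audio_output` is an assumption based on the reason names.

```typescript
// Field names mirror the properties named in the release note.
interface AudioFlowSnapshot {
  bot_audio_flowed: boolean;
  user_speech_flowed: boolean;
  end_reason?: string;
}

type Outcome =
  | { status: "completed" }
  | { status: "failed"; failureReason: string };

// End reasons that the notes say are now reported instead of "completed".
const EARLY_END_REASONS = new Set([
  "idle_timeout",
  "provider_circuit_open",
  "bot_startup_failed",
]);

// ASSUMPTION: the precedence and mapping below are illustrative guesses.
function classifyOutcome(s: AudioFlowSnapshot): Outcome {
  if (s.end_reason !== undefined && EARLY_END_REASONS.has(s.end_reason)) {
    return { status: "failed", failureReason: s.end_reason };
  }
  if (!s.bot_audio_flowed && !s.user_speech_flowed) {
    return { status: "failed", failureReason: "no_audio_either_direction" };
  }
  if (!s.bot_audio_flowed) {
    return { status: "failed", failureReason: "no_audio_output" };
  }
  return { status: "completed" };
}
```

Checking the explicit end reason first keeps a timeout from being relabelled as a generic silent call.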
SDK
- @osmapi/osmtalk-sdk@0.5.0 ships CallRecord.failureReason plus three helpers: describeFailureReason(), isRetryableFailure(), callConnected(). New filters on calls.list({ campaignId, failureReason }).
Why
A bulk-call test produced 85 calls of which 81% were incorrectly marked completed despite zero audio flowing — phantom calls from a TTS connect-storm under concurrency. The platform was charging for silence and the campaign engine wasn't retrying. This release makes the call status honest, the retries automatic, and the dashboard self-explaining when something fails.
v0.6.0 — Voice Agent Platform Hardening (April 2026)
Voice Pipeline Upgrade to v1.0
- osmTalk Voice Pipeline v1.0 stable — upgraded from v0.0.102
- Sarvam SDK 0.1.26 — fixes outdated-SDK integration bugs
- WebSocket retry storm fix — bot no longer floods providers with requests after an auth/credit failure (caps at 3 rapid failures → non-fatal ErrorFrame instead of infinite reconnect)
- MCPClient lifecycle — now uses the required await mcp.start() before register_tools(llm)
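The retry-storm fix in the list above amounts to counting failures inside a short window and refusing to reconnect once a cap is hit. A minimal sketch of that idea follows. The `RapidFailureGuard` class is hypothetical, and the 10-second window is an assumption, since the notes specify the cap of 3 but not what counts as "rapid".

```typescript
// Tracks recent connection failures and says whether another
// reconnect attempt is allowed.
class RapidFailureGuard {
  private failures: number[] = [];
  private readonly maxFailures: number;
  private readonly windowMs: number;

  // ASSUMPTION: a 10-second window; only the cap of 3 comes from the notes.
  constructor(maxFailures: number = 3, windowMs: number = 10000) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
  }

  // Record a failure at time nowMs. Returns true if the caller may
  // reconnect, false if it should give up and surface an error instead.
  recordFailure(nowMs: number): boolean {
    this.failures.push(nowMs);
    // Keep only failures that fall inside the sliding window.
    this.failures = this.failures.filter((t) => nowMs - t < this.windowMs);
    return this.failures.length < this.maxFailures;
  }

  // Call after a successful connection to clear the history.
  reset(): void {
    this.failures = [];
  }
}
```

A caller would invoke `recordFailure(Date.now())` in its connection error handler and emit a non-fatal error when it returns false.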
Tool Calling Improvements
- Smart cancel_on_interruption defaults — HTTP POST/PUT/DELETE tools default to False (won't half-finish transactions); GET tools default to True
- Per-tool cancelOnInterruption override in tool config
- Per-tool timeoutSecs — set tight timeouts on fast APIs, loose on slow ones
- Group parallel tools — LLM tool calls now run in parallel (default ON)
- Streaming intermediate results — handlers can emit progress via result_callback(msg, is_final=False)
- Function-args interrupt crash fix — no more JSONDecodeError when user interrupts mid-tool
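The interruption defaults above reduce to a small rule: an explicit per-tool setting wins, otherwise read-only GET tools are cancellable and mutating tools are not. The sketch below expresses that rule. The `HttpToolConfig` interface and `resolveCancelOnInterruption()` function are hypothetical names for illustration, and only the four HTTP methods named in the notes are modelled.

```typescript
type HttpMethod = "GET" | "POST" | "PUT" | "DELETE";

// Hypothetical tool config shape; cancelOnInterruption and timeoutSecs
// are the per-tool settings named in the release notes.
interface HttpToolConfig {
  method: HttpMethod;
  cancelOnInterruption?: boolean;
  timeoutSecs?: number;
}

// An explicit override always wins. Otherwise GET tools are cancelled
// on interruption and POST/PUT/DELETE tools run to completion.
function resolveCancelOnInterruption(tool: HttpToolConfig): boolean {
  if (tool.cancelOnInterruption !== undefined) {
    return tool.cancelOnInterruption;
  }
  return tool.method === "GET";
}
```

For example, a POST tool that is safe to abandon could set `cancelOnInterruption: true` to opt back in to cancellation.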
Voice Experience
- VAD audio_idle_timeout — fixes VAD getting stuck in SPEAKING state (default 1.5s, override via audioIdleTimeoutSecs setting)
- ElevenLabs 48kHz native audio — crisper voice, no upsampling artifacts (override via ttsSampleRate setting)
- Warm handoff summary — transfer_to_human now sends the last 10 conversation turns to the human agent's API
New LLM Provider: OpenAI Responses API
- Opt-in via openai-responses provider on agent
- WebSocket-based incremental context for lower latency on long calls
- Falls back to Chat Completions on HTTP chat (WhatsApp/Web Chat) since Responses API is voice-only
DTMF Keypad Support
- Phone callers — DTMF digits from PSTN are captured and fed to the LLM context as user input (e.g., "Press 1 for English, 2 for Hindi")
- Web widget keypad UI — users can tap a 3×4 keypad that sends digits over the data channel
- Per-agent toggle via enableDtmf setting (auto-enabled for phone agents)
Tool Status UI
- Live tool-execution cards in the voice widget — users see "Looking that up…" spinner while HTTP tools run
- Auto-dismiss after completion (3s fade)
- Error cards show red X and keep showing until the call ends
Observability
- Sentry integration (both bot + API) — set SENTRY_DSN env var to enable
- /api/health/deep endpoint — probes Database, Redis, Bot, voice transport, and MinIO
- Bot /health/deep endpoint — reports active call count, Sentry status, version
- Latency metrics dashboard (already existed) — per-call TTFB breakdown across STT/LLM/TTS
Graceful Shutdown
- Bot SIGTERM handler — active calls hear "I need to end this call for a system update, please call back" before disconnecting
- 20-second grace period (configurable via SHUTDOWN_GRACE_SECS) for in-flight TTS to finish
- Prevents dropped calls during deploys
System Prompt Fixes
- Web chat now sends dynamic context (agent name, date/time, format rules, language rules); previously it sent only the raw prompt
- Same behavior across all 4 channels (web call, phone call, web chat, WhatsApp chat)
Security & Cleanup
- Removed deprecated models from catalog: llama-4-maverick, anthropic/claude-sonnet-4-5-20250929 (on OpenRouter), saaras:v2.5, scribe_v2_realtime
- Validated all remaining STT/TTS/LLM models against live provider APIs (35 total verified)
v0.5.0 — Shared Credits & Billing (April 2026)
Billing & Payments
- Usage-based credits — Pay per call, no subscriptions
- Shared credits with osmAPI — Single balance across both platforms
- Razorpay integration — Top up via card, UPI, or netbanking
- Billing dashboard — Balance, transactions, usage with INR/USD toggle
- Pre-call balance check — Calls rejected if credits below ₹1
- Cost breakdown per call — See LLM, STT, TTS, and SIP costs individually
- Chat billing — LLM token costs deducted for chat messages
- Inbound call billing — SIP charges for incoming phone calls
- WhatsApp call billing — SIP charges for WhatsApp voice calls
- Minimum charge — ₹1 minimum per call
- Phone number setup fee — ₹50 one-time on purchase
- Monthly phone rental — Automatic billing via background job
- Low balance alerts — Email when credits drop below ₹10
- Phone deactivation — Numbers auto-deactivated when rental unpaid
- Data retention — Free 30-day storage, cleanup after expiry
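To make the per-call arithmetic above concrete, here is a small sketch of the ₹1 minimum charge and the pre-call balance check. It works in integer paise to avoid floating-point error, which is a swapped-in simplification and not the Decimal.js approach this release uses. The `CostBreakdownPaise` interface and both function names are hypothetical, and applying the minimum to the summed components is an assumption about how the rule is evaluated.

```typescript
// Per-call cost components, in paise (1 rupee = 100 paise).
interface CostBreakdownPaise {
  llm: number;
  stt: number;
  tts: number;
  sip: number;
}

const MIN_CHARGE_PAISE = 100; // Rs 1 minimum per call
const MIN_BALANCE_PAISE = 100; // calls rejected if credits are below Rs 1

// ASSUMPTION: the minimum applies to the sum of all components.
function callChargePaise(cost: CostBreakdownPaise): number {
  const total = cost.llm + cost.stt + cost.tts + cost.sip;
  return Math.max(total, MIN_CHARGE_PAISE);
}

// Pre-call check: allow the call only if the balance is at least Rs 1.
function canStartCall(balancePaise: number): boolean {
  return balancePaise >= MIN_BALANCE_PAISE;
}
```

Under these assumptions a call costing 40 paise in total would be billed the ₹1 minimum, while one costing ₹4 would be billed ₹4.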
Auth
- Shared auth with osmAPI — Single login across both platforms
- Login/register redirects to osmAPI (email, Google, GitHub)
- Cross-domain token-based authentication
- Redis session cache (5-minute TTL)
- Auth guard on all dashboard pages
Enterprise Features
- Decimal.js precision — No floating-point billing errors
- Atomic transactions — All-or-nothing credit deduction
- Rate limiting — Per IP, user, and API key
- Structured logging — Pino JSON logs for production
- Background jobs — Phone billing + data retention cron endpoints
- Idempotent webhooks — Duplicate payment protection
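Idempotent webhook handling, as listed above, usually means remembering which payment IDs have already been applied so a redelivered event cannot credit the account twice. The in-memory sketch below shows the idea only. The `PaymentLedger` class is hypothetical, and a real service would more plausibly enforce this with a unique database constraint inside the same transaction as the credit update.

```typescript
// Minimal in-memory ledger that applies each payment at most once.
class PaymentLedger {
  private processed: Set<string> = new Set<string>();
  private balancePaise: number = 0;

  // Returns true if the top-up was applied, false if this paymentId
  // was already seen and the event was ignored as a duplicate.
  applyTopUp(paymentId: string, amountPaise: number): boolean {
    if (this.processed.has(paymentId)) {
      return false;
    }
    this.processed.add(paymentId);
    this.balancePaise += amountPaise;
    return true;
  }

  get balance(): number {
    return this.balancePaise;
  }
}
```

Delivering the same webhook twice leaves the balance unchanged after the first application.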
Infrastructure
- Dual database (OsmTalk + StartFlow shared DB on Neon)
- Docker PostgreSQL removed (migrated to Neon cloud)
- Multi-organization support with org switcher
- Projects table for grouping agents
Docs
- Billing documentation (overview, credits, pricing, top-up)
- API reference for all endpoints
- Phone number guides updated
- Changelog added
v0.4.0 — MCP Server & Outbound Calls (March 2026)
New Features
- MCP Server — Make phone calls from Claude Desktop (npm)
- openingMessage — Agent speaks a pre-written message instantly when call connects (no LLM delay)
- callerName — Agent identifies who it's calling on behalf of
- Call history via MCP — View calls, transcripts, and analytics from Claude
- Dashboard analytics — Total calls, success rate, top agents
Improvements
- Widget outbound call validation (instruction length, E.164 phone format)
- MCP tools: list_calls, get_dashboard, get_call_result
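The outbound call validation mentioned in this list checks instruction length and E.164 phone format. The sketch below shows what such a check could look like. The function names and the `maxInstructionLength` parameter are hypothetical, since the changelog does not state the actual limit; the E.164 pattern itself follows the standard (a plus sign, a non-zero leading digit, at most 15 digits in total).

```typescript
// E.164: "+" followed by 2 to 15 digits, the first of which is not 0.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(phone: string): boolean {
  return E164_PATTERN.test(phone);
}

// Returns a list of human-readable problems; an empty list means valid.
// ASSUMPTION: maxInstructionLength is caller-supplied because the real
// limit is not documented in the changelog.
function validateOutboundCall(
  phone: string,
  instruction: string,
  maxInstructionLength: number,
): string[] {
  const errors: string[] = [];
  if (!isE164(phone)) {
    errors.push("phone must be in E.164 format, e.g. +919876543210");
  }
  if (instruction.trim().length === 0) {
    errors.push("instruction must not be empty");
  } else if (instruction.length > maxInstructionLength) {
    errors.push("instruction exceeds " + maxInstructionLength + " characters");
  }
  return errors;
}
```

Returning all problems at once lets a widget show every validation message in a single pass.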
v0.3.0 — WhatsApp & Call Transfer (February 2026)
New Features
- WhatsApp integration — Connect WhatsApp Business numbers to voice agents
- WhatsApp calling — Inbound and outbound calls via WhatsApp
- Call transfer — Transfer active calls to a human agent or another number
- Context summarization — Automatic conversation summarization for long calls
Improvements
- ElevenLabs STT/TTS support (Scribe v2, Flash v2.5, Turbo v2.5)
- Smart turn detection for natural conversations
- Voicemail detection for outbound calls
- End-call detection (agent hangs up when user says goodbye)
v0.2.0 — Phone Numbers & SIP (January 2026)
New Features
- Phone number provisioning — Buy Indian phone numbers and assign to agents
- Inbound calls — Assign agents to phone numbers for automatic answering
- Outbound calls — Dial any number with a voice agent
- SIP integration — osmTalk SIP gateway for PSTN connectivity
- Call recordings — Automatic recording with MinIO storage
- Widget embed — Embeddable voice/chat widget for websites
Improvements
- Background sounds (office, nature, cafe)
- Advanced VAD settings (confidence, start/stop times)
- Multiple TTS voices per provider
v0.1.0 — Initial Release (December 2025)
Features
- Voice agent creation with custom system prompts
- Multi-provider support: OpenAI, Groq, Anthropic, Deepgram, Sarvam
- Real-time voice calls via osmTalk's WebRTC transport
- Chat interface (text-based conversations)
- Call transcripts and metrics
- Team management with role-based access
- Agent tools (HTTP tools, client tools, MCP servers)
- Dashboard with call analytics