Speech-1: conversational speech model for Voice AI Agents
Announcing our next-generation speech model, designed for customer phone calls.
Today, we are announcing Speech-1, our new conversational speech model, purpose-built for customer phone calls.
Our near-term goal with our speech model is to deliver a voice experience that sounds as familiar and natural as a human agent. The latest model is a significant step towards our ultimate goal: helping developers build the best AI voice calling experience.
Our customers were frustrated with the tone, flow, and reliability of today's speech models.
1 in 3 calls dropped within 10 seconds when answered by these speech models. Our customers operate at a minimum 90% customer satisfaction (CSAT) score today with human agents. But when they tried best-in-class solutions, CSAT dropped to a shocking 50% on the calls that did go through. Here’s how these issues sound in action.
Industry-standard speech models sound realistic but are tuned for content creation: audiobook-like narration, crisp audio, and speech patterns that you just wouldn't find in an average customer service call. Yet these models power most conversational agents today.
Speech-1 was built to boost customer satisfaction and call success rates. It upgrades the call experience in three key ways:
Speech-1 produces conversational patterns—tone, pace, pause, and expressions—that mirror human calls. This engages callers and builds trust.
No one wants AI that randomly sounds overly excited, angry, sad — or worse, shouts at callers. Our model understands the meaning of words in context and adjusts tone, rhythm, and expression accordingly. The result is a voice that sounds consistent, coherent, and human.
Speech-1 reliably speaks the key details critical for a successful call: names, spellings, numbers, emails, addresses, and more. It does so with pacing, pauses, and stress that are easy to follow.
We've tested with select customers, and the feedback has been overwhelmingly positive.
"I’m impressed! Earlier solutions sounded janky, like pre-recorded messages stacked on top of one another. The new voices sound fluent, life-like!" – Elliott Winter, Lottie
"Both clients & leads were hesitant to deploy AI agents as the voice sounded 'off'. The new voice is a game-changer, sounds like human-agents and has encouraged many clients to fully adopt AI receptionists." – Charanjit Dabb, Flamingo Digital
We're now preparing to roll out to more customers. Stay tuned!
If you’re interested in making callers happy, boosting margins, and increasing revenue, we’d love to hear from you!
For those interested in the technical details, stay tuned for a future blog post.