


ASR is the easy bit. Voice agents are the bit that breaks.
Three things stood out at T-Bot 👾 . // NVIDIA AI // Daily's Speech AI Meetup in SF:
🎙 NVIDIA's open-weights speech models keep getting sharper. Nemotron Streaming ASR is out, and Jagadeesh Balam confirmed context-aware streaming is in flight: "2 to 3 months and you'll see something." Plenty for the open community to build on.
🎙 Artificial Analysis flagged five under-benchmarked S2S capabilities: multilingual, steerability, safety, audio-beyond-speech, robustness. AA's benchmarks are where we already sit at the sharp end on accuracy, and these five capabilities are exactly where conversations succeed or break in production.
🎙 Kwindla Hultman Kramer (Daily / Pipecat) put Voice as a top-level pillar of AI-native software, alongside subagents, very-long context, and dynamic UIs. Voice as the interface, not a feature.
Best line of the night, paraphrased: "ASR we thought was solved. Voice agents brought their own problems." Emails, addresses, names, mid-conversation corrections — the noisy real world. Latency is largely a solved problem. Quality of conversation isn't, and that's the bar we wake up for at Speechmatics. Our recent push on sharper alphanumerics is exactly the kind of work the panel was asking for, and we're actively looking into context-aware ASR for the same accuracy gains.
And BRAVO to T-Bot 👾 . for his first event on the West Coast!
Thanks Adi Margolin + JoAnn Peach + Maryam Motamedi (NVIDIA), T-Bot 👾 ., Kwin + Nina Kuruvilla (Daily), and Chroma. Lovely to connect with George Cameron + Kiriill Butler (Artificial Analysis), Sridhar Krishna Nemala + Tara Bogavelli (ServiceNow), Fabian Seipel + Manolo Espinosa (ai-coustics), and many others.
#VoiceAI #VoiceAISpace #SpeechAI #NVIDIA #Pipecat
Shared by Reese Cole - 11 days ago