
Cartesia Launches Ink-2, Topping the Streaming Speech-to-Text Leaderboard 🎙️

Cartesia introduces its new Ink-2 model, claiming the #1 spot on the Artificial Analysis (AA) leaderboard for streaming speech-to-text. The model is optimized for real-time voice AI agents.

Sources x.com

Cartesia has announced Ink-2, a next-generation streaming speech-to-text (STT) model that has taken the top spot on the Artificial Analysis (AA) leaderboard. The model focuses on reducing latency and is optimized for voice interaction tasks.

Developments

According to the Cartesia team, Ink-2 includes a number of features tuned specifically for real-time AI agents. With top-tier models in both text-to-speech (TTS) and speech-to-text (STT), Cartesia is cementing its position in the conversational AI infrastructure space.

Why It Matters

Latency in real-time voice interaction has been a major barrier to natural AI agent experiences. Ink-2's top ranking on the AA leaderboard points to meaningful gains in latency and accuracy for applications such as AI call centers, virtual assistants, and voice control systems, where long cloud processing delays disrupt the conversation.