StreamingMeme

The streaming technology industry news aggregator.

NVIDIA's Nemotron 3.5 ASR Offers Real-Time Speech-to-Text in 40 Languages


NVIDIA has released Nemotron 3.5 ASR, a 600M-parameter speech-to-text model that transcribes 40 language-locales in real time, offering low latency and high accuracy with built-in punctuation and capitalization. The model is open-weights and fine-tunable, and it addresses common challenges in multilingual speech recognition for streaming video applications. The source article on huggingface.co also includes a detailed guide to fine-tuning the model for specific languages or domains.

Key Takeaways

  • Nemotron 3.5 ASR transcribes 40 language-locales from a single 600M-parameter checkpoint with real-time performance.
  • The model incorporates punctuation and capitalization natively, eliminating the need for post-processing.
  • Its Cache-Aware FastConformer-RNNT architecture processes each audio frame once, with no recomputation, providing low latency (down to 80 ms) and high accuracy.
  • Fine-tuning options allow for adapting the model to specific languages, domains, or accents, with demonstrated WER improvements of 31-32% for under-resourced languages like Greek and Bulgarian.
  • The model supports dynamic latency configuration via `att_context_size` at inference time, ranging from 80ms (ultra-low) to 1.12s (high accuracy).
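
The latency range in the last takeaway can be illustrated with a small sketch. It assumes that `att_context_size` is a `[left, right]` pair of encoder-frame counts, that each frame spans 80 ms, and that latency is `(right + 1)` frames. Those assumptions are consistent with the 80 ms and 1.12 s endpoints quoted above, but this summary does not confirm them. The candidate values and both helper functions are hypothetical, not part of any NVIDIA API.

```python
# Assumption: one encoder frame covers 80 ms of audio.
FRAME_MS = 80

# Hypothetical [left, right] context settings; not taken from the release.
CANDIDATES = [[70, 0], [70, 1], [70, 6], [70, 13]]


def chunk_latency_ms(att_context_size):
    """Return the assumed latency in ms for a [left, right] context pair."""
    left, right = att_context_size
    if left < 0 or right < 0:
        raise ValueError("context sizes must be non-negative")
    # Assumption: the model waits for the current frame plus `right` lookahead frames.
    return (right + 1) * FRAME_MS


def pick_context(budget_ms, candidates=CANDIDATES):
    """Pick the highest-latency candidate that still fits the latency budget."""
    within = [c for c in candidates if chunk_latency_ms(c) <= budget_ms]
    if not within:
        raise ValueError("no candidate fits the latency budget")
    return max(within, key=chunk_latency_ms)


if __name__ == "__main__":
    for ctx in CANDIDATES:
        print(ctx, chunk_latency_ms(ctx), "ms")
    print("best under 600 ms:", pick_context(600))
```

Under these assumptions, a larger right context buys more lookahead (and typically better accuracy) at the cost of latency, which is why a live-caption pipeline might pick the largest setting its latency budget allows.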

Why It Matters

This release directly impacts streaming video applications requiring low-latency, accurate, and multilingual speech-to-text capabilities, such as live captions, voice agents, and call-center analytics. By offering a single, fine-tunable, open-weights model for 40 languages, NVIDIA reduces infrastructure complexity and costs associated with managing multiple APIs or models. The configurable latency and native punctuation capabilities also streamline development. Moving forward, watch for adoption rates and independent benchmarks of Nemotron 3.5 ASR in diverse production environments, especially how its fine-tuning capabilities are leveraged for long-tail languages and specialized domains.
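
For readers planning their own benchmarks, the following is a minimal sketch of how word error rate (WER) and a relative WER improvement are commonly computed. The summary does not say whether the 31-32% figures are relative or absolute, so the relative form shown here is only one plausible reading. The function names and the sample numbers are illustrative, not taken from the release.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer, tuned_wer):
    """Fraction by which WER dropped relative to the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - tuned_wer) / baseline_wer


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {wer:.3f}")
    # Illustrative numbers only: a drop from 25% to 17% WER.
    print(f"relative improvement: {relative_improvement(0.25, 0.17):.0%}")
```

Production evaluations usually normalize text (casing, punctuation, numerals) before scoring, which matters for a model that emits punctuation and capitalization natively.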


Read full article at huggingface.co

