StreamingMeme

The streaming technology industry news aggregator.

© 2026 StreamingMeme. All rights reserved.
AI & Video | Technical Development | June 7, 2026

NVIDIA Details Video Summarization Microservice Performance on H100, RTX, L40S GPUs

Nvidia

NVIDIA has released performance metrics for its Video Summarization microservice, detailing end-to-end latency and maximum concurrent request capacity. The data is provided across various video lengths and GPU platforms, including H100, RTX Professional 6000 SE, and L40S. This information helps streaming professionals size GPU infrastructure for AI-driven video processing workloads.

Key Takeaways

  • NVIDIA's Video Summarization microservice uses RTVI-VLM for vision inference and Nemotron 3 Nano for summarization, operating at FP8 model precision.
  • An H100 configuration with a 4x4 topology summarizes a 10-minute video in 20.1 seconds and sustains 125 concurrent requests for videos of that length at the target latency.
  • The RTX Professional 6000 SE, also in a 4x4 topology, summarizes a 10-minute video in 25.9 seconds and handles 79 concurrent requests for videos of that length.
  • The L40S, in the same 4x4 topology, summarizes a 10-minute video in 39.1 seconds and supports 41 concurrent requests for videos of that length.
  • Performance metrics are provided for video lengths ranging from 1 minute to 720 minutes, tested in a warehouse safety monitoring scenario.
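
As a rough illustration of how the 10-minute figures in the takeaways might feed a sizing estimate, the Python sketch below computes how much faster than real time each platform summarizes a video and how many deployments would cover a given peak of concurrent requests. The function names and the 300-request workload are hypothetical, and the sketch assumes each quoted concurrency figure applies to one complete 4x4 deployment and that capacity scales linearly when deployments are added. Neither assumption comes from NVIDIA's data, so treat the output as a starting point rather than a guarantee.

```python
import math

# Figures quoted in the takeaways for a 10-minute (600 s) video, 4x4 topology.
BENCHMARKS = {
    "H100": {"latency_s": 20.1, "max_concurrent": 125},
    "RTX Professional 6000 SE": {"latency_s": 25.9, "max_concurrent": 79},
    "L40S": {"latency_s": 39.1, "max_concurrent": 41},
}
VIDEO_LENGTH_S = 600


def realtime_factor(latency_s, video_length_s=VIDEO_LENGTH_S):
    """Video duration divided by end-to-end summarization latency."""
    return video_length_s / latency_s


def deployments_needed(peak_concurrent, max_concurrent):
    """Deployments required to cover a peak, ASSUMING linear scaling."""
    return math.ceil(peak_concurrent / max_concurrent)


def sizing_table(peak_concurrent):
    """Per-platform summary for a hypothetical peak of concurrent requests."""
    table = {}
    for gpu, metrics in BENCHMARKS.items():
        table[gpu] = {
            "realtime_factor": round(realtime_factor(metrics["latency_s"]), 1),
            "deployments": deployments_needed(
                peak_concurrent, metrics["max_concurrent"]
            ),
        }
    return table


if __name__ == "__main__":
    # 300 concurrent 10-minute requests is an illustrative workload only.
    for gpu, row in sizing_table(300).items():
        print(f"{gpu}: {row['realtime_factor']}x real time, "
              f"{row['deployments']} deployment(s)")
```

For the illustrative 300-request peak, this yields 3 deployments on H100, 4 on RTX Professional 6000 SE, and 8 on L40S. Other video lengths would need their own latency and concurrency figures from NVIDIA's published tables.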

Why It Matters

NVIDIA's detailed performance benchmarks provide concrete data for optimizing GPU infrastructure in AI-driven video processing. This allows enterprises to make informed decisions on hardware deployments for real-time video summarization, impacting efficiency and cost for applications like content moderation, security, and media analysis. The focus on specific GPU models and configurations highlights NVIDIA's push to integrate its hardware deeper into the video AI pipeline. Going forward, watch for adoption rates of these specific GPU configurations in enterprise video AI deployments and any subsequent updates on real-world performance at scale.

Additional Context

NVIDIA has been actively enhancing its video AI capabilities, with the Video Summarization microservice (VSS) representing a key component. In May 2026, NVIDIA released a Metropolis Blueprint for VSS, which aims to transform large volumes of video into searchable, actionable intelligence (NVIDIA Developer Blog, May 2026). This blueprint emphasizes a modular design, advanced fusion search, and integration with AI agents like Codex and OpenClaw, enabling automated deployment and interaction via chat interfaces. The VSS architecture provides a reference for building video analytics AI agents that perceive, reason, and act on live and recorded video streams. VSS also supports the use of OpenAI-compatible Vision-Language Models (VLMs) and Large Language Models (LLMs), alongside its own optimized VLMs like CR1, CR2, and Qwen, for flexible model selection (NVIDIA VSS documentation, current). Furthermore, the VSS microservice offers both REST API and Model Context Protocol (MCP) interfaces, facilitating integration into diverse workflows and AI orchestration systems (NVIDIA VSS documentation, current).


Read full article at docs.nvidia.com


