NVIDIA Benchmarks VSS Alert Bridge Performance for AI Video Analytics
NVIDIA benchmarked its VSS Alert Bridge for security alert verification across several of its hardware platforms, including RTX PRO 6000 WE, DGX Spark, AGX Thor, and DGX H100 SXM. The tests measured end-to-end latency, throughput, and per-stage latency using RT-DETR and Grounding DINO models, with both local and remote VLM deployments. The documentation reports these results at several stream concurrency levels.
Key Takeaways
- RTX PRO 6000 WE achieved sub-second average end-to-end latency at 20 concurrent streams, but latency increased significantly at 60 streams.
- DGX H100 SXM, using RT-DETR with a local LLM, showed end-to-end latency increasing from 0.65s (1 stream) to 25.54s (60 streams).
- Grounding DINO with a local LLM on DGX H100 SXM scaled from 0.65s (1 stream) to 7.60s (57 streams), while 40 streams produced no events.
- DGX Spark and AGX Thor were tested with single-stream configurations, achieving 1.25s and 0.91s average end-to-end latency respectively.
- Tail latency (P90, P99) and GPU contention were identified as factors impacting performance, especially at higher stream concurrency levels.
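To make the tail-latency point concrete, the sketch below computes the mean, P50, P90, and P99 from a list of per-alert end-to-end latencies. It is a minimal illustration, not NVIDIA's measurement method: the sample values are hypothetical, the function names are invented for this example, and the nearest-rank percentile definition is an assumption (the benchmark documentation may use a different one).

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples):
    """Return the mean and the P50/P90/P99 latencies for a list of samples."""
    return {
        "mean": statistics.fmean(samples),
        "p50": percentile(samples, 50),
        "p90": percentile(samples, 90),
        "p99": percentile(samples, 99),
    }


# Hypothetical end-to-end latencies in seconds (NOT taken from the benchmark).
latencies = [0.6, 0.7, 0.65, 0.7, 0.8, 0.75, 0.7, 0.9, 3.5, 6.0]
summary = summarize(latencies)

for name, value in summary.items():
    print(f"{name}: {value:.2f}s")
```

In this made-up sample, two slow alerts pull the mean to more than double the median, which is why an average alone can hide the delays that P90 and P99 expose under GPU contention.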
Why It Matters
NVIDIA's performance data for VSS Alert Bridge gives integrators deploying AI-driven video analytics concrete numbers to plan with. Knowing the latency and throughput of each hardware and model configuration supports more precise infrastructure sizing and more realistic performance expectations for real-time security and operational intelligence applications. The data also shows the trade-off in scaling these workloads: end-to-end latency rises as stream concurrency grows. The industry will watch for further optimizations in multi-stream VLM performance and for how these benchmarks translate into large-scale deployments.
Additional Context
NVIDIA's VSS (Video Search and Summarization) framework, which includes the Alert Bridge, has been a focus for the company's AI initiatives in video. In a May 2026 NVIDIA blog post by Samuel Ochoa, the company highlighted VSS Metropolis Blueprint's ability to transform video into searchable and actionable intelligence using AI agents and skills. The modular design of VSS 3 was emphasized, with various developer profiles for different workflows like alert verification, video summarization, and search. The blog post also provided key performance metrics for agentic search workflows on H100 and RTX PRO 6000, noting a retrieval latency of 2.24s and 1.87s respectively, and alert verification latency of 0.89s on AGX Thor and 0.82s on RTX PRO 6000. Additionally, the post discussed using coding agents like Codex and OpenClaw with VSS skills to automate deployment and video analysis, illustrating how agents can be used to identify specific events in video footage. This contextualizes the detailed benchmarks, showing how various VSS components fit into broader AI video analytics solutions and the tools available for deployment and interaction.
Read full article at docs.nvidia.com