Low-latency HTTP streaming protocols enable sub-second video delivery at scale
This technical explainer by CDN77 details the configuration and protocol optimizations required to achieve sub-5-second or sub-1-second latency using HLS and MPEG-DASH. It highlights key trade-offs in segment duration, transfer encoding, manifest index fetching, and playback buffers.
Key Takeaways
- Ultra-low latency targets sub-1-second delivery, compared to standard 20-40 second HTTP streaming delays
- Chunked transfer encoding lets clients start playing a segment while it is still being uploaded to the origin
- MPEG-DASH segment templates allow clients to calculate segment URLs locally, reducing manifest fetch times to roughly 10 ms
- Apple’s LL-HLS remained in draft status as of late 2024, utilizing separate partial segments rather than standard chunked transfers
- Infrastructure hurdles persist: standard object storage such as AWS S3 does not natively accept uploads of unknown size, except in specific high-performance storage classes
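The local URL calculation mentioned in the DASH takeaway can be sketched as follows. This is a minimal illustration, not the article's code: it assumes a number-based template with a fixed segment duration, a period that starts at the availability start time, and a client clock synchronized with the packager. The template string, representation ID, and function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def live_edge_number(now, availability_start, segment_duration_s,
                     start_number=1, include_in_progress=False):
    """Return the segment number at the live edge.

    Assumes fixed-duration segments numbered from start_number, beginning
    at availability_start. With include_in_progress=True the segment that
    is still being produced is returned, which is what a chunked
    low-latency client would request.
    """
    elapsed = (now - availability_start).total_seconds()
    if elapsed < 0:
        raise ValueError("stream has not started yet")
    completed = int(elapsed // segment_duration_s)
    if include_in_progress:
        return start_number + completed
    if completed < 1:
        raise ValueError("no complete segment is available yet")
    return start_number + completed - 1


def segment_url(template, representation_id, number):
    """Fill a simple template; width-formatted $Number%05d$ is not handled."""
    return (template
            .replace("$RepresentationID$", representation_id)
            .replace("$Number$", str(number)))


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = start + timedelta(seconds=125)
number = live_edge_number(now, start, 2.0)
print(segment_url("video/$RepresentationID$/seg-$Number$.m4s", "720p", number))
# prints "video/720p/seg-62.m4s"
```

Because the URL is derived from the clock, the client does not need to re-download the manifest to discover each new segment; it only needs the template once.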
Why It Matters
Achieving sub-second latency over standard HTTP infrastructure is critical for the profitability of interactive streaming sectors like sports betting and social commerce. This approach allows operators to avoid the high costs and limited scalability of WebRTC by repurposing existing CDN delivery stacks. However, the divergence between Apple’s LL-HLS and the broader industry’s reliance on CMAF-based chunked transfers creates a fragmented development environment. Engineers must now balance the quality trade-offs of shortened segments against the overhead of frequent manifest updates. Watch for wider adoption of high-performance object storage classes to address the 'unknown file size' limitation of traditional cloud origins.
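The "unknown file size" limitation follows from how HTTP/1.1 chunked transfer coding works: the sender omits Content-Length and instead prefixes each piece of the body with its size in hexadecimal, ending with a zero-length chunk. The sketch below shows only that framing, assuming the encoder hands over CMAF chunks as byte strings; the function name and sample payloads are hypothetical, and HTTP/2 connections carry the same idea in DATA frames rather than this text framing.

```python
from typing import Iterable, Iterator


def frame_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame byte strings as an HTTP/1.1 chunked message body.

    Each chunk is "<hex size>\r\n<data>\r\n"; the body ends with "0\r\n\r\n".
    The total length never has to be known in advance.
    """
    for part in parts:
        if not part:
            # An empty chunk would be read as the end-of-body marker.
            continue
        yield f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    # Terminating chunk with no trailer fields.
    yield b"0\r\n\r\n"


body = b"".join(frame_chunked([b"moof", b"mdat-data"]))
print(body)
# prints b'4\r\nmoof\r\n9\r\nmdat-data\r\n0\r\n\r\n'
```

An origin that insists on a declared object size cannot accept a body like this until the segment is complete, which is the storage constraint the paragraph above refers to.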
Additional Context
The industry's push toward lower latency has seen significant infrastructure and protocol movement over the last 18 months. Per AWS in late 2024, S3 Express One Zone gained the ability to append data to existing objects, addressing a major technical bottleneck. This capability specifically aids media-streaming workflows by supporting high-frequency ingestion and lower request latency, which was previously a barrier for standard object storage in low-latency pipelines. Simultaneously, the standardization of Apple’s LL-HLS continues through incremental updates to the IETF drafts. Per IETF documentation from November 2024, 'draft-pantos-hls-rfc8216bis' was in its 16th revision, reinforcing features like playlist delta updates and preload hints to manage the increased request volume inherent in short-segment streaming. While LL-HLS provides a native path for the Apple ecosystem, many providers still use DASH for non-Apple platforms because of its mature support for chunked transfer encoding (CTE). Practical performance benchmarks from Gcore in early 2025 demonstrated that combining LL-HLS and LL-DASH can yield glass-to-glass delays of 2.2 to 3.0 seconds in real-world CDN environments. This performance level is increasingly necessary as major broadcasters migrate live sports to digital-only platforms. Recent reporting from Ant Media in April 2026 suggests that while WebRTC remains the standard for sub-500ms interactivity, the 3-5 second range achieved by tuned DASH and HLS is becoming the benchmark for broadcast-grade OTT services.
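To make the LL-HLS request-volume point concrete, the sketch below builds a playlist request that uses the draft's delivery directives: `_HLS_msn` and `_HLS_part` ask the server to hold the response until the named segment or part exists, and `_HLS_skip=YES` asks for a delta update. It assumes the server has advertised these capabilities in `EXT-X-SERVER-CONTROL` (`CAN-BLOCK-RELOAD=YES` and `CAN-SKIP-UNTIL`); the function name and example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def blocking_reload_url(playlist_url, next_msn, next_part=None, skip=False):
    """Add LL-HLS delivery directives to a media playlist URL.

    Existing non-directive query parameters (for example an auth token)
    are preserved. Parameters are sorted by name so that identical
    requests map to identical CDN cache keys.
    """
    scheme, netloc, path, query, fragment = urlsplit(playlist_url)
    params = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
              if not k.startswith("_HLS_")]
    params.append(("_HLS_msn", str(next_msn)))
    if next_part is not None:
        params.append(("_HLS_part", str(next_part)))
    if skip:
        params.append(("_HLS_skip", "YES"))
    params.sort(key=lambda kv: kv[0])
    return urlunsplit((scheme, netloc, path, urlencode(params), fragment))


print(blocking_reload_url("https://example.com/live/video.m3u8", 273, 2, skip=True))
# prints "https://example.com/live/video.m3u8?_HLS_msn=273&_HLS_part=2&_HLS_skip=YES"
```

One blocking request per part replaces repeated polling, and the delta update keeps each response small, which is how the draft tries to contain the request overhead of very short parts.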
Read full article at cdn77.com
