Apple Introduces New Immersive Video and Spatial Audio Formats for Vision Pro
Apple has introduced Apple Immersive Video (AIV) and Apple Spatial Audio Format (ASAF), two new media formats for visionOS designed for adaptive, high-quality stereoscopic experiences on Apple Vision Pro. Together they combine advanced capture, metadata-driven precision, and tailored audio in a single workflow that spans production and delivery. AIV targets high visual acuity and a wide field of view, while ASAF unifies object-based, channel-based, and Higher Order Ambisonics (HOA) audio to meet spatial and perceptual demands.
Key Takeaways
- AIV captures stereoscopic video at 90 frames per second and greater than 50 megapixels per eye, maintaining world-scale accuracy.
- AIV delivers a peripheral field of view up to 230 degrees, optimizing pixel utilization within natural human viewing ranges.
- ASAF combines object-based, channel-based, and Higher Order Ambisonics (HOA) audio for adaptable, precise spatial sound.
- Metadata-driven workflows for AIV and ASAF are designed to streamline production from capture to delivery on visionOS.
- APAC (Apple Positional Audio Codec) was developed to efficiently deliver high-resolution ASAF content at practical bitrates starting at 64 kilobits per second.
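To put the figures in the list above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the takeaways and treats them as lower bounds. The function names are hypothetical, and the calculation assumes 1 kilobit = 1,000 bits. It says nothing about how AIV or APAC actually encode or compress data.

```python
# Figures quoted in the takeaways; both are treated as lower bounds.
FRAMES_PER_SECOND = 90
MEGAPIXELS_PER_EYE = 50   # "greater than 50 megapixels", so this is a floor
EYES = 2                  # stereoscopic capture: one image per eye
APAC_MIN_KBPS = 64        # lowest APAC bitrate cited


def pixels_per_second(megapixels_per_eye: int, eyes: int, fps: int) -> int:
    """Raw captured pixels per second across both eyes (hypothetical helper)."""
    return megapixels_per_eye * 1_000_000 * eyes * fps


def audio_bytes_per_minute(kbps: int) -> int:
    """Bytes of audio per minute, assuming 1 kilobit = 1,000 bits."""
    return kbps * 1000 * 60 // 8


video_rate = pixels_per_second(MEGAPIXELS_PER_EYE, EYES, FRAMES_PER_SECOND)
audio_size = audio_bytes_per_minute(APAC_MIN_KBPS)

print(f"At least {video_rate / 1e9:.1f} gigapixels captured per second")
print(f"About {audio_size / 1000:.0f} kB of audio per minute at {APAC_MIN_KBPS} kbps")
```

Under these assumptions, capture produces at least 9 billion pixels per second before any compression, while a minute of ASAF audio at the lowest cited APAC bitrate comes to roughly 480 kB.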
Why It Matters
These new formats address the technical demands of high-fidelity immersive content for Apple Vision Pro and aim to simplify complex production workflows. By integrating advanced capture with precise metadata, Apple is positioning them as a standard for spatial computing media. The emphasis on high pixel density, a wide field of view, and responsive spatial audio signals a commitment to building an ecosystem for VR/AR content. Two things to watch are adoption by major content producers and the emergence of third-party tools supporting AIV and ASAF, particularly how the formats integrate into existing broadcast and post-production pipelines.
Additional Context
Apple's push into immersive video formats is supported by developments across the professional software and hardware ecosystem. Blackmagic Design, for instance, has integrated full Apple Immersive Video workflows into DaVinci Resolve Studio 20, including support for the Blackmagic URSA Cine Immersive camera, which features dual 8K sensors capturing at 90 frames per second (CineD, June 2025). DaVinci Resolve Studio 20 offers an Immersive viewer for 2D presentation, live preview on Apple Vision Pro, and integrated Apple Spatial Audio Format mixing. It also preserves lens metadata and provides a visionOS export preset.
Separately, Apple's Compressor 5.0 can transcode various stereoscopic 3D source videos into spatial video for editing in Final Cut Pro or direct playback on Apple Vision Pro (Apple Support). MV-HEVC spatial video encoding requires an Apple silicon Mac and macOS 14 or later, which ties the workflow to Apple's own hardware platforms. The Immersive Media Support framework in macOS and visionOS 26 provides APIs for reading and writing the metadata AIV requires, and enables previewing content during editorial workflows (WWDC25).
Taken together, these developments show Apple building an end-to-end production and delivery pipeline for spatial content, from acquisition to final consumption on Vision Pro.
Read full article at developer.apple.com
