AutoMQ and Amazon FSx bypass Kafka's cost-latency trade-off with diskless WAL
AutoMQ now supports Amazon FSx for NetApp ONTAP, enabling diskless multi-AZ Apache Kafka deployments on AWS with sub-10ms write latency. The integration eliminates cross-AZ data transfer costs and reduces operational complexity while improving performance and scalability for streaming infrastructure. It addresses the write-ahead log (WAL) challenge by using FSx for ONTAP as a shared storage layer, delivering low latency and multi-AZ durability without traditional replication overhead.
Key Takeaways
- Regional shared storage via FSx for ONTAP enables multi-AZ durability without requiring synchronous broker-to-broker replication.
- Benchmarks show sub-10ms write latency and 28ms end-to-end latency, approaching local disk performance for real-time streaming.
- Decoupled architecture uses Amazon S3 for long-term retention, separating fixed WAL costs from variable storage scaling.
- Stateless brokers scale in seconds, with no time-consuming partition reassignment or data migration.
Why It Matters
Multi-AZ Kafka deployments traditionally force a choice between high-latency object storage and expensive cross-zone data transfer fees. This integration indicates that the 'diskless' model is maturing beyond batch use cases into low-latency, mission-critical streaming. For video platforms managing unpredictable traffic spikes, the ability to scale stateless brokers without rebalancing data removes a significant operational bottleneck and cost multiplier. The move underscores a shift toward shared-storage architectures in which cloud-native file systems like FSx for ONTAP replace local EBS volumes for write coordination. Watch for whether managed services like Amazon MSK or Confluent Cloud introduce similar regional shared-log options to counter these efficiency gains.
Additional Context
The move to diskless Kafka reflects a broader industry push to reduce 'hidden' cloud networking fees. Per Confluent research from April 2026, cross-AZ traffic frequently accounts for over 50% of the total Kafka bill for large-scale production clusters. This cost, often buried under generic network usage in cloud invoices, stems from Kafka’s native replication model, which copies every byte between availability zones to ensure data durability. By relying on regional storage services that handle replication at the storage layer, products like AutoMQ and competitors such as WarpStream and Aiven’s 'Inkless' (launched in early 2026) aim to bypass these cross-AZ transfer charges entirely while maintaining the resilience required for enterprise workloads.
Simultaneously, AWS has aggressively expanded the underlying storage infrastructure that makes these architectures feasible. In April 2026, Amazon announced the expansion of second-generation FSx for NetApp ONTAP file systems to four additional regions, including London and São Paulo. These updated systems support up to 12 highly available file server pairs, providing throughput of up to 72 GBps. This increase in IOPS and throughput, combined with the AWS Transfer Family integration reported in early 2026, positions FSx for ONTAP as a critical performance layer for data-heavy streaming applications that cannot tolerate the 150ms+ latencies typical of direct-to-S3 ingestion.
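To make the cross-AZ replication cost discussed in this section more concrete, the following is a rough back-of-the-envelope sketch in Python, not a figure from the article or from any vendor. It assumes a traditional Kafka layout where each partition's replicas sit in distinct availability zones and producers are spread evenly across zones. The function names and the per-GB price are hypothetical placeholders; actual transfer rates should be taken from current AWS pricing.

```python
def cross_az_gb(produced_gb: float, replication_factor: int = 3, num_azs: int = 3) -> float:
    """Estimate GB crossing AZ boundaries on the write path of replicated Kafka.

    Assumptions (illustrative only):
      * every replica of a partition lives in a different AZ,
      * producers are spread evenly across AZs, so a fraction
        (num_azs - 1) / num_azs of produce traffic reaches a leader in another AZ.
    Consumer traffic is ignored.
    """
    if not 1 <= replication_factor <= num_azs:
        raise ValueError("expected 1 <= replication_factor <= num_azs")
    # Each produced byte is copied from the leader to (RF - 1) followers in other AZs.
    replication_gb = produced_gb * (replication_factor - 1)
    # Part of the produce traffic itself crosses an AZ boundary to reach the leader.
    produce_gb = produced_gb * (num_azs - 1) / num_azs
    return replication_gb + produce_gb


def cross_az_cost(produced_gb: float, price_per_gb: float,
                  replication_factor: int = 3, num_azs: int = 3) -> float:
    """Multiply the estimated cross-AZ volume by a caller-supplied price per GB."""
    return cross_az_gb(produced_gb, replication_factor, num_azs) * price_per_gb


if __name__ == "__main__":
    # 300 GB produced, RF=3 across 3 AZs; 0.02 is a placeholder price, not a quoted rate.
    volume = cross_az_gb(300)
    print(f"cross-AZ volume: {volume:.0f} GB")
    print(f"estimated cost:  {cross_az_cost(300, price_per_gb=0.02):.2f}")
```

Under these assumptions, the follower-replication term dominates the total. That is the part a shared regional storage layer is meant to remove, since durability is then handled below the broker rather than by broker-to-broker copies.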
Read full article at community.netapp.com
