AI & Video | Technical Development | June 8, 2026

MLX Port for 24-Language Voice-Clone TTS Reduces Model Size by 73%

Quantized MLX weights for `rednote-hilab/dots.tts-soar`, a 24-language zero-shot voice-clone TTS model, are now available for Apple Silicon. The int4 weights cut the model size by 73%, from about 9 GB to about 2.4 GB, with no measured regression in transcription accuracy or voice similarity on a multilingual check, which makes local deployment on Metal-compatible hardware more practical. The MLX port and quantization code are released under an Apache-2.0 license.

Key Takeaways

`dots.tts-soar`, a 24-language zero-shot voice-clone Text-to-Speech (TTS) model, now has quantized MLX weights.
The int4 weights reduce the model size by 73% to approximately 2.4 GB, down from 9 GB.
The MLX port and quantization code are released under an Apache-2.0 license, and the weights run on the `dots-tts-mlx` runtime.
A multilingual quality check (EN/DE/ES/FR plus Hindi) showed no regression in transcription accuracy or voice similarity for the int4 and int8 variants compared to bf16.
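
The size figures in the takeaways above can be checked with a few lines of arithmetic. The sketch below recomputes the reduction and, as a rough illustration only, the average bits per weight the size ratio would imply. It assumes the 9 GB baseline stores every weight as 16-bit bf16, which the article does not state; the function names are hypothetical.

```python
# Figures quoted in the article (both approximate).
ORIGINAL_GB = 9.0  # size of the original checkpoint
INT4_GB = 2.4      # size of the int4 MLX checkpoint


def size_reduction(original_gb: float, quantized_gb: float) -> float:
    """Return the fractional size reduction, e.g. 0.73 for 73%."""
    return 1.0 - quantized_gb / original_gb


def implied_bits_per_weight(
    original_gb: float, quantized_gb: float, original_bits: int = 16
) -> float:
    """Average bits per weight implied by the size ratio.

    ASSUMPTION: the original checkpoint stores every weight in
    `original_bits` bits (16 for bf16). The article does not confirm this.
    """
    return original_bits * quantized_gb / original_gb


reduction = size_reduction(ORIGINAL_GB, INT4_GB)
bits = implied_bits_per_weight(ORIGINAL_GB, INT4_GB)

print(f"size reduction: {reduction:.1%}")               # size reduction: 73.3%
print(f"implied average bits per weight: {bits:.2f}")   # 4.27
```

Under that assumption the average comes out slightly above 4 bits per weight. That would be consistent with a checkpoint that stores some extra data alongside the 4-bit values or keeps some tensors at higher precision, but the article does not describe the quantization scheme, so this is only a plausibility check.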

Why It Matters

Cutting a 24-language voice-clone TTS model from about 9 GB to about 2.4 GB lowers the memory and storage needed to run it, which matters most for local deployment on Apple Silicon. This lowers the barrier to entry for high-quality, multilingual AI voice generation in applications such as dubbing, content creation, and accessibility tools. The open question is whether the smaller footprint leads to wider adoption and new use cases for on-device TTS.

Additional Context

The availability of MLX ports of TTS models such as `dots.tts-soar` reflects a growing trend toward local AI inference on Apple Silicon. Several other projects also use MLX for speech synthesis and processing. `appautomaton/mlx-speech` provides local speech synthesis on Apple Silicon for TTS, voice cloning, and dialogue, supporting models such as MossTTSLocal and VibeVoice (GitHub, March 2026). `louiscoetzee/mlx-tts-studio` is a native macOS app for high-quality TTS using Qwen3-TTS models, offering voice cloning and voice design entirely on-device (GitHub, February 2026). The `mlx-tts-server` project offers an OpenAI-compatible Text-to-Speech server for Apple Silicon, powered by Qwen3-TTS and MLX, which points toward standardized API access for local models (PyPI, March 2026). Together, these projects show increased developer focus on the on-device AI capabilities of Apple hardware, reducing reliance on cloud-based services and potentially improving privacy and latency for real-time streaming applications.
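
To illustrate what "OpenAI-compatible" means for a local TTS server, the sketch below builds a request in the shape of OpenAI's `/v1/audio/speech` endpoint (a JSON body with `model`, `input`, and `voice`). The host, port, model id, and voice name are placeholders, not values from the article, and the route a given local server actually exposes should be checked against its own documentation.

```python
import json
import urllib.request

# ASSUMPTIONS (not from the article): a server listening on localhost:8000
# that mirrors OpenAI's audio speech route, plus placeholder model and
# voice names.
BASE_URL = "http://localhost:8000"


def build_speech_request(
    text: str,
    model: str = "placeholder-tts-model",
    voice: str = "placeholder-voice",
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_speech_request("Hello from a local model.")
print(req.get_method(), req.full_url)

# Sending the request needs a running server, so it is left commented out.
# The audio format of the response depends on the server.
# with urllib.request.urlopen(req) as resp:
#     audio_bytes = resp.read()
```

Because the request shape matches the hosted API, client code written against a cloud TTS service can in principle be pointed at a local server by changing only the base URL.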

Read full article at huggingface.co
