
Voicebox is a free, open-source, local-first AI voice studio for cloning voices, generating speech in 23 languages, and dictating anywhere.
Voicebox is a local-first AI voice studio: a free and open-source alternative to ElevenLabs and WisprFlow, combined in one app. It can clone a voice from a few seconds of audio, generate speech in 23 languages across seven TTS engines, dictate into any text field with a global hotkey, and give any MCP-aware AI agent a voice of your choosing. To dictate, hold a customizable key chord anywhere on your machine; a floating on-screen pill walks through the recording, transcribing, refining, and done stages, and every capture is preserved with its transcript in the Captures tab. The whole pipeline runs on your machine. OpenAI Whisper handles transcription and a bundled local LLM refines the output, running on MLX for Apple Silicon or on PyTorch for CUDA, ROCm, DirectML, or CPU. A REST API exposes voice I/O to your own apps.
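The four stages shown by the dictation pill can be pictured as a small linear state machine. The sketch below is illustrative only and is not taken from Voicebox's code: the `Stage` enum, the `NEXT_STAGE` table, and the `advance` and `run_dictation` functions are hypothetical names, and the extra `IDLE` state is an assumption about what happens before and after a capture.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical pill stages; IDLE is an assumed resting state."""
    IDLE = "idle"
    RECORDING = "recording"
    TRANSCRIBING = "transcribing"
    REFINING = "refining"
    DONE = "done"


# Each stage has exactly one successor, so the pill moves strictly forward
# and returns to IDLE once a capture is finished.
NEXT_STAGE = {
    Stage.IDLE: Stage.RECORDING,
    Stage.RECORDING: Stage.TRANSCRIBING,
    Stage.TRANSCRIBING: Stage.REFINING,
    Stage.REFINING: Stage.DONE,
    Stage.DONE: Stage.IDLE,
}


def advance(stage: Stage) -> Stage:
    """Return the stage that follows the given one."""
    return NEXT_STAGE[stage]


def run_dictation() -> list[str]:
    """Walk one capture from IDLE to DONE and return the labels shown."""
    stage = Stage.IDLE
    labels = []
    while stage is not Stage.DONE:
        stage = advance(stage)
        labels.append(stage.value)
    return labels
```

Calling `run_dictation()` returns the stage labels in the order the description lists them; a real implementation would trigger audio capture, transcription, and refinement at each transition rather than only recording the label.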