CosyVoice is a text-to-speech model that generates expressive, natural-sounding speech from text, using a reference audio clip to mimic a speaker's voice characteristics.
Configure your audio generation parameters
Enter the text you want to convert to speech
Upload an audio file to use as a reference for voice characteristics
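
The two inputs above (the text and a reference audio file) could be checked and bundled before submission roughly as follows. This is a minimal sketch, not the service's actual API: the function name `build_tts_request`, the payload field names, and the list of accepted audio formats are all assumptions made for illustration.

```python
import base64
from pathlib import Path

# Hypothetical set of accepted reference-audio formats; the page does not list them.
ALLOWED_SUFFIXES = {".wav", ".mp3", ".flac"}


def build_tts_request(text: str, reference_audio: Path) -> dict:
    """Validate the two inputs and bundle them into a payload dict.

    The field names below are illustrative only, not a documented schema.
    """
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("text must not be empty")
    if reference_audio.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported audio format: {reference_audio.suffix}")
    if not reference_audio.is_file():
        raise FileNotFoundError(reference_audio)

    # Base64-encode the audio bytes so they can travel inside a JSON body.
    audio_b64 = base64.b64encode(reference_audio.read_bytes()).decode("ascii")
    return {
        "text": cleaned,
        "reference_audio_name": reference_audio.name,
        "reference_audio_b64": audio_b64,
    }
```

Checking the inputs locally catches an empty prompt or a missing file before anything is uploaded.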
Enter a prompt, upload a reference audio file, and click Generate to create your first speech clip.