2. Language
💡 Tips for Best Results:
- Clean Audio: Use a reference audio clip without background noise or music.
- Length: A reference clip of 3 to 10 seconds is usually the sweet spot.
- Language Match: Make sure the selected language matches the language of the text you typed!
- First Run: The very first generation might take a few extra seconds while the models allocate memory.
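
To check the length tip before uploading, here is a minimal sketch, assuming the reference clip is an uncompressed WAV file. It uses only Python's standard `wave` module. The function names `clip_duration_seconds` and `is_good_reference_length` are hypothetical helpers for illustration and are not part of this tool.

```python
import wave


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(path: str, min_s: float = 3.0, max_s: float = 10.0) -> bool:
    """Check the clip against the 3 to 10 second range suggested in the tips.

    The bounds are defaults taken from the tip above, not hard limits.
    """
    return min_s <= clip_duration_seconds(path) <= max_s
```

For example, `is_good_reference_length("my_voice.wav")` would return `True` for a 5-second clip. Other formats such as MP3 would need a different reader, since `wave` handles WAV files only.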