I just found this. This is huge! As a German, I use Thorsten medium
[https://huggingface.co/csukuangfj/sherpa-onnx-apk/resolve/main/tts-engine-new/1.10.26/sherpa-onnx-1.10.26-arm64-v8a-de-tts-engine-vits-piper-de_DE-thorsten-medium.apk]
as he simply made the best dataset. Mixing English with German, speaking
numbers and single letters, pausing at a line break without a “.”, all of those
can be essential. And… it is nearly perfect! And all local! This is crazy!
eSpeak can finally be laid to rest!
Nice! It all depends on the model.
Thorsten has recorded 23+ hours of audio for his main dataset, crazy.