This project is maintained by Shreeram Chandra
Authors: Shreeram Suresh Chandra, Zongyang Du, Berrak Sisman
Speech and Machine Learning Lab, The University of Texas at Dallas
Submitted to Speaker Odyssey 2024. Code will be released after acceptance.
| | FastSpeech2 [1] | VITS [2] | TEMOTTS |
|---|---|---|---|
| 1. | | | |
| 2. | | | |
| 3. | | | |
| 4. | | | |
| 5. | | | |
| 6. | | | |
| 7. | | | |
| | Text | FastSpeech2 [1] | VITS [2] | TEMOTTS |
|---|---|---|---|---|
| 1. | Blowing out birthday candles makes me feel special! | | | |
| 2. | Her heart felt heavy with sorrow. | | | |
| 3. | I am feeling sad. | | | |
| 4. | I feel joy when I see colourful balloons. | | | |
| 5. | I feel like a broken toy discarded and forgotten. | | | |
| 6. | I'm about to explode with anger! | | | |
| 7. | I'm so angry I can't even breathe. | | | |
| 8. | I'm so angry I could spit fire. | | | |
| 9. | Playing with toys brings me so much happiness! | | | |
| 10. | She felt like a part of her was missing. | | | |
| 11. | Singing and dancing make me feel so good. | | | |
| 12. | Smiling at others fills me with happiness. | | | |
| 13. | Tears welled up in her eyes. | | | |
| 14. | This is driving me crazy. | | | |
| 15. | Watching a funny movie makes me laugh out loud. | | | |
[1] Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, and Tie-Yan Liu, “FastSpeech 2: Fast and high-quality end-to-end text to speech,” in International Conference on Learning Representations, 2021.
[2] Jaehyeon Kim, Jungil Kong, and Juhee Son, “Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech,” in International Conference on Machine Learning. PMLR, 2021, pp. 5530–5540.