A Survey of TTS Speech Synthesis

Speaker adaptation
If you have very limited data, consider fine-tuning a pre-trained
model. For example, starting from a model pre-trained on LJSpeech,
you can adapt it to data from VCTK speaker p225 (30 mins) with a
single training command (the command is omitted in this copy). From my
experience, this reaches reasonable speech quality much faster than
training the model from scratch.
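
As a rough illustration of why starting from pre-trained weights helps, here is a toy sketch in plain Python. It is not the TTS training command mentioned above and does not use any TTS library: it fits a one-parameter linear model by gradient descent, and the names `train`, `mse`, `source_data`, and `target_data` are all hypothetical stand-ins for a large source corpus and a small target-speaker set.

```python
def train(w, data, steps, lr=0.1):
    """Plain gradient descent on mean squared error for the model y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def mse(w, data):
    """Mean squared error of y = w * x on the given (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


# Hypothetical "large" source-speaker data (true slope 2.0) ...
source_data = [(x / 10, 2.0 * x / 10) for x in range(1, 11)]
# ... and a tiny target-speaker set with a slightly different slope (2.2).
target_data = [(0.5, 1.1), (1.0, 2.2)]

# Pre-train on the source data, then spend the same small step budget
# on the target data from two different starting points.
pretrained_w = train(0.0, source_data, steps=200)
adapted_w = train(pretrained_w, target_data, steps=5)  # fine-tuning
scratch_w = train(0.0, target_data, steps=5)           # from scratch

adapted_loss = mse(adapted_w, target_data)
scratch_loss = mse(scratch_w, target_data)
print(adapted_loss < scratch_loss)
```

With the same five update steps, the warm-started model ends much closer to the target than the one trained from zero, which is the effect described above in miniature.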

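Voice cloning means that, given speech from a new, unseen speaker, only a small number of utterances from that user (even a single one) are needed to synthesize speech in that voice. Voice cloning covers both the speaker adaptation we commonly use and the speaker encoding newly proposed in this paper.
The most traditional approach is to fine-tune the model on this data to obtain a new model, which is already a form of cloning.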
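
To make the contrast with adaptation concrete: with speaker encoding the synthesis model stays fixed, and the new voice is summarized as an embedding vector computed from the reference utterances. The sketch below assumes a common recipe (average the per-utterance embeddings, then L2-normalize); the source does not say how the embedding is built, and the vectors in `utterance_embeddings` are made-up placeholders for the output of a hypothetical speaker encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def speaker_embedding(utterance_embeddings):
    """Average per-utterance embeddings, then L2-normalize the mean."""
    count = len(utterance_embeddings)
    dim = len(utterance_embeddings[0])
    mean = [sum(e[i] for e in utterance_embeddings) / count for i in range(dim)]
    return l2_normalize(mean)


# Placeholder outputs of a hypothetical speaker encoder for three
# short utterances from the same unseen speaker.
utterance_embeddings = [
    [0.9, 0.1, 0.3],
    [0.8, 0.2, 0.4],
    [1.0, 0.0, 0.2],
]

# This single vector would condition the (unchanged) TTS model.
embedding = speaker_embedding(utterance_embeddings)
print(embedding)
```

No model weights change here, which is why this route can work from very few utterances, whereas adaptation needs a training run.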

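A Survey of Voice Conversion Techniques
Voice conversion is the following task: given an input utterance, make it sound as if another person said it while keeping the spoken content unchanged. A typical use case is the bow-tie voice changer in Detective Conan.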

The general voice conversion process has three steps: 1. feature extraction; 2. feature conversion; 3. speech resynthesis.
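
As one possible instance of step 2, the sketch below converts a pitch (F0) contour with the classic log-F0 mean-variance transformation. This is an assumption for illustration: the source does not say which features or which conversion method it has in mind. Step 1 (extracting F0 from audio) and step 3 (resynthesis with a vocoder) are outside the sketch, and the F0 values and function names are hypothetical.

```python
import math
import statistics


def log_f0_stats(f0_values):
    """Mean and standard deviation of log-F0 over voiced frames (F0 > 0)."""
    logs = [math.log(f) for f in f0_values if f > 0]
    return statistics.mean(logs), statistics.pstdev(logs)


def convert_f0(f0_values, src_stats, tgt_stats):
    """Map source F0 onto the target speaker's log-F0 distribution.

    Unvoiced frames are marked with 0 and are passed through as 0.
    """
    src_mean, src_std = src_stats
    tgt_mean, tgt_std = tgt_stats
    converted = []
    for f in f0_values:
        if f <= 0:
            converted.append(0.0)
        else:
            z = (math.log(f) - src_mean) / src_std
            converted.append(math.exp(z * tgt_std + tgt_mean))
    return converted


# Hypothetical F0 contours in Hz, as step 1 might produce them.
source_f0 = [0.0, 110.0, 120.0, 130.0, 0.0, 125.0]
target_f0 = [210.0, 0.0, 230.0, 250.0, 240.0]

src_stats = log_f0_stats(source_f0)
tgt_stats = log_f0_stats(target_f0)
converted_f0 = convert_f0(source_f0, src_stats, tgt_stats)
print(converted_f0)
```

The converted contour keeps the source utterance's pitch movement but has the target speaker's log-F0 mean and spread. A full system would convert spectral features as well before handing everything to the vocoder in step 3.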

Original: https://blog.csdn.net/xys430381_1/article/details/109136036
Author: xys430381_1
Title: TTS语音合成综述
