STARWGAN-GP and x-vector-based many-to-many speaker conversion method
A many-to-many speaker conversion method, applied in neural learning methods, speech analysis, instruments, etc., that addresses problems such as GAN training instability and vanishing gradients.
Embodiment Construction
[0054] As shown in Figure 1, the high-quality speech conversion method of the present invention is divided into two parts: the training part obtains the parameters and conversion functions required for speech conversion, and the conversion part converts the source speaker's voice into the target speaker's voice.
[0055] The implementation steps of the training phase are:
[0056] 1.1) Obtain a non-parallel-text training corpus containing multiple speakers, including the source speaker and the target speaker. The training corpus is taken from the VCC2018 speech corpus, whose training set contains 6 male and 6 female speakers with 81 utterances per speaker. The method supports conversion with parallel text as well as with non-parallel text, so the training corpus may be non-parallel.
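The corpus layout in step 1.1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patent's implementation: the speaker IDs (`F1`..`F6`, `M1`..`M6`) and the helper names are hypothetical, and only the counts (6 male speakers, 6 female speakers, 81 utterances each) come from the text. It indexes the utterances, enumerates the ordered source/target pairs a many-to-many model has to cover, and draws a non-parallel training pair by sampling the source and target utterances independently.

```python
import itertools
import random

# Hypothetical speaker IDs; the text only states 6 female and 6 male speakers.
SPEAKERS = [f"F{i}" for i in range(1, 7)] + [f"M{i}" for i in range(1, 7)]
UTTERANCES_PER_SPEAKER = 81  # stated in step 1.1


def build_index(speakers, n_utts):
    """Map each speaker to a list of (speaker, utterance_number) keys."""
    return {spk: [(spk, k) for k in range(1, n_utts + 1)] for spk in speakers}


def conversion_pairs(speakers):
    """All ordered (source, target) pairs with source != target."""
    return list(itertools.permutations(speakers, 2))


def sample_nonparallel(index, rng):
    """Pick two distinct speakers, then one utterance from each independently,
    so the two utterances are not required to share the same sentence."""
    src, tgt = rng.sample(sorted(index), 2)
    return rng.choice(index[src]), rng.choice(index[tgt])


index = build_index(SPEAKERS, UTTERANCES_PER_SPEAKER)
total_utterances = sum(len(utts) for utts in index.values())  # 12 * 81 = 972
pairs = conversion_pairs(SPEAKERS)                            # 12 * 11 = 132
src_utt, tgt_utt = sample_nonparallel(index, random.Random(0))
```

With 12 speakers, a single many-to-many model covers 132 conversion directions, whereas a one-to-one approach would need a separate model per direction.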
[0057] 1.2) The training corpus uses the WORLD speech analysis / synthe...