Many-to-many speaker conversion method based on StarGAN and x-vector
A speaker conversion method, applied in speech analysis, instrumentation, speech synthesis, and related fields. It addresses problems such as the inability to fully express the speaker's individual characteristics and the limited improvement in the similarity of the converted speech.
Embodiment Construction
[0055] As shown in Figure 1, the method of the present invention is divided into two parts: the training part obtains the parameters and conversion functions required for voice conversion, and the conversion part converts the source speaker's voice into the target speaker's voice.
[0056] The implementation steps of the training phase are:
[0057] 1.1) Obtain the training corpus, which consists of speech from multiple speakers, including the source speakers and the target speakers. The training corpus is taken from the VCC2018 speech corpus, whose training set contains 6 male and 6 female speakers with 81 sentences each. Because the method supports conversion with both parallel and non-parallel text, the training corpus may be non-parallel text.
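To make the corpus size in step 1.1 concrete, the following minimal Python sketch tallies the utterances and the conversion directions available among the speakers. The speaker IDs (M1 to M6, F1 to F6) and the one-hot label encoding are assumptions made for illustration; the text does not specify either.

```python
from itertools import permutations

# Figures stated in step 1.1: 6 male and 6 female speakers, 81 sentences each.
NUM_MALE = 6
NUM_FEMALE = 6
SENTENCES_PER_SPEAKER = 81

# Hypothetical speaker IDs, for illustration only (not the corpus's own names).
speakers = [f"M{i}" for i in range(1, NUM_MALE + 1)] + \
           [f"F{i}" for i in range(1, NUM_FEMALE + 1)]

# Total number of training utterances across all speakers.
total_utterances = len(speakers) * SENTENCES_PER_SPEAKER

# Each ordered (source, target) pair with source != target is one
# conversion direction in a many-to-many setting.
conversion_pairs = list(permutations(speakers, 2))


def one_hot(speaker):
    """Return a one-hot speaker label (an assumed encoding, not from the text)."""
    return [1 if s == speaker else 0 for s in speakers]


print(total_utterances)       # 972
print(len(conversion_pairs))  # 132
```

With 12 speakers, a single many-to-many model covers 132 source-to-target directions, which is the motivation for not training one model per speaker pair.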
[0058] 1.2) The WORLD speech analysis/synthesis model is applied to the training corpus to extract the spectral en...