Many-to-many speech conversion system based on VAE and i-vector under non-parallel text conditions
A voice conversion technology for non-parallel text, applied in the field of signal processing, that addresses the problem that the speaker similarity of the converted voice is not ideal.
Examples
Embodiment 1
[0027] Referring to Figure 1 and Figure 2, this embodiment provides a many-to-many speech conversion system based on VAE and i-vector under non-parallel text conditions. The method is divided into two stages, training and conversion:
[0028] 1 Speaker speech training stage
[0029] 1.1 Obtain the training corpus. The speech corpus used here is VCC2018, which contains 8 source speakers and 4 target speakers. The training corpus is divided into two groups: 4 male speakers and 4 female speakers. For each speaker that is fully trained, 81 sentences are used as the training corpus and 35 sentences are used as the test corpus for model evaluation;
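The per-speaker split described in step 1.1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the speaker IDs, file names, and the function `split_corpus` are hypothetical, and the sketch assumes each speaker's sentences are available as a flat list of 116 files. Only the counts (81 training, 35 test) come from the text.

```python
from typing import Dict, List, Tuple

N_TRAIN = 81  # training sentences per speaker (stated in the text)
N_TEST = 35   # test sentences per speaker (stated in the text)


def split_corpus(
    utterances: Dict[str, List[str]],
) -> Dict[str, Tuple[List[str], List[str]]]:
    """Split each speaker's sentences into (train, test) lists.

    Files are sorted first so the split is deterministic; the first
    N_TRAIN go to training and the next N_TEST to testing.
    """
    splits = {}
    for speaker, files in utterances.items():
        ordered = sorted(files)
        if len(ordered) < N_TRAIN + N_TEST:
            raise ValueError(f"{speaker}: need {N_TRAIN + N_TEST} sentences")
        train = ordered[:N_TRAIN]
        test = ordered[N_TRAIN:N_TRAIN + N_TEST]
        splits[speaker] = (train, test)
    return splits


# Hypothetical speaker IDs and file names, for illustration only.
corpus = {
    spk: [f"{spk}_{i:03d}.wav" for i in range(N_TRAIN + N_TEST)]
    for spk in ("spk_a", "spk_b")
}
splits = split_corpus(corpus)
```

Sorting before slicing keeps the train and test sets disjoint and reproducible across runs.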
[0030] 1.2 Use the WORLD speech analysis and synthesis model to extract the speech features of each frame of each speaker's sentences: the spectral envelope sp', the logarithmic fundamental frequency log f0, and the aperiodicity ap. Then calculate the energy en of each frame of speech and recalculate the spectral envelope, i.e. sp...
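Two of the per-frame quantities in step 1.2 can be sketched in Python. This is an illustration under stated assumptions, not the patent's implementation: the toy arrays stand in for the output of a WORLD analysis, unvoiced frames (f0 of 0) are assumed to map to a log value of 0.0, and the frame energy is assumed to be the sum of the spectral envelope bins, since the text does not give its definition. The recalculation of the spectral envelope is truncated in the source and is not reproduced here. The function names `log_f0` and `frame_energy` are hypothetical.

```python
import math
from typing import List, Sequence


def log_f0(f0: Sequence[float]) -> List[float]:
    """Log fundamental frequency per frame.

    Assumption: unvoiced frames (f0 == 0) are kept at 0.0 rather than
    taking the log of zero.
    """
    return [math.log(v) if v > 0 else 0.0 for v in f0]


def frame_energy(sp: Sequence[Sequence[float]]) -> List[float]:
    """Energy per frame.

    Assumption: energy is the sum of the spectral envelope bins of
    the frame; the source text does not define it.
    """
    return [sum(frame) for frame in sp]


# Toy data standing in for WORLD output: 3 frames, 4 frequency bins.
f0 = [0.0, 110.0, 220.0]
sp_prime = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 0.0, 0.0, 0.0],
]

logf0 = log_f0(f0)
en = frame_energy(sp_prime)
```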