Many-to-many speaker conversion method based on DenseNet STARGAN
A many-to-many speaker conversion method, applied in speech analysis, speech recognition, instruments, etc. It addresses problems such as network degradation and vanishing gradients, and achieves enhanced representation ability, good nonlinear representation ability, and improved extraction ability.
Embodiment Construction
[0043] The principle of the method of the present invention is shown in Figure 1. The DenseNet network is applied to the STARGAN model: a 6-layer DenseNet network is constructed in the encoding and decoding stages of the generator to overcome the degradation problem of deep networks and to make the semantic features easier for the encoding network to learn. This lets the STARGAN model fully learn the deep semantic features and the personality (speaker-specific) features of the spectrum, which in turn improves the quality of the spectrum generated by the decoding network.
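The dense connectivity that DenseNet relies on can be illustrated with a minimal pure-Python sketch. It is not the patented generator: only the layer count (6) comes from the text above, while the growth rate, the toy layer function, and all names (`make_toy_layer`, `dense_block`, `GROWTH_RATE`) are hypothetical stand-ins chosen for illustration. The point it shows is that each layer receives the concatenation of the block input and every earlier layer's output, so early features stay directly reachable by later layers.

```python
from typing import Callable, List

Channel = List[float]                          # one feature channel, flattened to 1-D
Layer = Callable[[List[Channel]], List[Channel]]


def make_toy_layer(growth_rate: int) -> Layer:
    """Hypothetical stand-in for a convolutional layer.

    Emits `growth_rate` new channels, each a ReLU of the element-wise mean of
    all input channels scaled by a fixed factor. A real model would use
    learned convolutions instead.
    """
    def layer(inputs: List[Channel]) -> List[Channel]:
        n = len(inputs)
        mean = [sum(vals) / n for vals in zip(*inputs)]
        return [[max(0.0, (k + 1) * v) for v in mean] for k in range(growth_rate)]
    return layer


def dense_block(x: List[Channel], layers: List[Layer]) -> List[Channel]:
    """Apply dense connectivity: layer i sees the block input plus the
    outputs of layers 0..i-1, concatenated along the channel axis."""
    features = list(x)
    for layer in layers:
        new_channels = layer(features)
        features = features + new_channels     # channel-wise concatenation
    return features


NUM_LAYERS = 6     # layer count stated in the description
GROWTH_RATE = 4    # assumption: not specified in the text

x = [[1.0, -2.0, 3.0], [0.5, 0.5, 0.5]]        # 2 input channels, 3 positions
out = dense_block(x, [make_toy_layer(GROWTH_RATE) for _ in range(NUM_LAYERS)])
print(len(out))    # 2 input channels + 6 layers * 4 new channels = 26
```

Because the input channels are carried through unchanged, `out[:2]` equals `x`. These short paths from early features to later layers are what help against degradation and vanishing gradients in deep networks.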
[0044] The specific implementation is divided into two parts: the training part is used to obtain the features and conversion functions required for speech conversion, and the conversion part is used to realize the conversion of the source speaker's voice into the target speaker's voice.
[0045] The implementation steps of the training phase are:
[0046] 1.1) Obtain the training corpus of non-parallel text,...