Voice synthesis related system, method and device and equipment

A speech synthesis and voice interaction technology, applied in the field of speech synthesis methods and devices. It addresses the low quality of speech synthesis for multilingual text: by avoiding inaccurate or even wrong pronunciation, it improves speech synthesis quality and, in turn, user experience.

Pending Publication Date: 2021-12-31
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0006] This application provides a speech synthesis method to solve the problem, present in the prior art, of low speech synthesis quality for multilingual text.

Examples

Embodiment 1

[0117] Please refer to Figure 1, which is a schematic diagram of an embodiment of the voice interaction system of the present application. The voice interaction system provided in this embodiment includes a server 1 and a smart speaker 2.

[0118] Server 1 may be a server deployed on a cloud server, or a server dedicated to implementing a voice interaction system, and may be deployed in a data center.

[0119] Smart speaker 2 can be a tool that lets home consumers access the Internet by voice, for example to request songs, shop online, or check the weather forecast. It can also control smart home devices, for example opening the curtains, setting the refrigerator temperature, or heating up the water heater.

[0120] Please refer to Figure 2, which is a schematic diagram of a scenario of the voice interaction system of the present application. The server 1 and the smart speaker 2 can be connected through a network; for example, the smart speaker 2 can be connected to the Internet thr...

Embodiment 2

[0126] The foregoing embodiment provides a voice interaction system. Correspondingly, the present application also provides a speech synthesis method, which may be executed by a device such as a server. The method corresponds to the system embodiment above. Parts of this embodiment that are the same as in the first embodiment are not described again; please refer to the corresponding parts of the first embodiment.

[0127] Please refer to Figure 4, which is a schematic flowchart of an embodiment of the speech synthesis method of the present application. In this embodiment, the method includes the following steps:

[0128] Step S101: Using the first user's cross-language voice conversion algorithm, generate, from a first voice data set of a second user in a first language, a second voice data set in the first language that carries the first user's timbre.

[0129] The second user may be multiple second users other than...

Embodiment 3

[0171] The foregoing embodiment provides a speech synthesis method. Correspondingly, the present application also provides a speech synthesis device. The device corresponds to the method embodiment above. Parts of this embodiment that are the same as in the second embodiment are not described again; please refer to the corresponding parts of the second embodiment.

[0172] The speech synthesis device provided by the present application comprises:

[0173] A training data generation unit, configured to generate, through a cross-language voice conversion algorithm and from a first voice data set of a second user in a first language, a second voice data set in the first language that carries the first user's timbre;

[0174] A speech synthesizer training unit, configured to generate a speech synthesizer for the first user with multilingual capabilities according to the second speech data set and a third speech data set in ...

Abstract

The invention discloses a system, method, device, and equipment related to voice interaction. The speech synthesis method comprises the following steps: using the first user's cross-language voice conversion algorithm, generating, from a first voice data set of a second user in a first language, a second voice data set in the first language that carries the first user's timbre; generating, from the second voice data set and a third voice data set of the first user in a second language, a speech synthesizer for the first user with multilingual capability; and using the speech synthesizer to generate the first user's synthesized speech data for a first multilingual mixed text. This processing can effectively improve the speech synthesis quality of multilingual text and thereby improve user experience.
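
The three steps in the abstract can be sketched as a small data-flow example. This is only an illustrative sketch, not the patent's implementation: `Utterance`, `convert_voice`, `build_training_set`, and `train_synthesizer` are hypothetical names, audio is omitted entirely, and the conversion and training functions are stand-ins that only track which speaker's timbre and which language each item carries.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Utterance:
    """Metadata for one recording; real audio samples are omitted in this sketch."""
    speaker: str   # whose timbre the audio carries
    language: str  # language of the spoken text
    text: str      # transcript


def convert_voice(utt: Utterance, target_speaker: str) -> Utterance:
    """Hypothetical stand-in for cross-language voice conversion:
    language and text are kept, the timbre becomes the target speaker's."""
    return Utterance(target_speaker, utt.language, utt.text)


def build_training_set(first_set: List[Utterance],
                       third_set: List[Utterance],
                       first_user: str) -> List[Utterance]:
    """Step 1: convert the second user's first-language data into a second
    data set with the first user's timbre, then add the first user's own
    second-language data (the third data set)."""
    second_set = [convert_voice(u, first_user) for u in first_set]
    return second_set + third_set


def train_synthesizer(training_set: List[Utterance]) -> Callable[[str], Utterance]:
    """Step 2 (stand-in for model training): returns a function that labels
    any input text with the single speaker and the languages it was trained on."""
    speakers = {u.speaker for u in training_set}
    if len(speakers) != 1:
        raise ValueError("training data must carry exactly one timbre")
    speaker = next(iter(speakers))
    languages = "+".join(sorted({u.language for u in training_set}))

    def synthesize(text: str) -> Utterance:
        # Step 3: 'synthesize' mixed-language text in the first user's timbre.
        return Utterance(speaker, languages, text)

    return synthesize


# Assumed example data: user_b speaks English, user_a speaks Chinese.
first_set = [Utterance("user_b", "en", "hello world")]
third_set = [Utterance("user_a", "zh", "你好")]
tts = train_synthesizer(build_training_set(first_set, third_set, "user_a"))
result = tts("今天 weather 很好")
```

In this toy run, every training item ends up with the timbre of `user_a` while covering both languages, which is the property the abstract relies on for synthesizing mixed-language text in a single voice.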

Description

Technical field

[0001] The present application relates to the technical field of speech synthesis, in particular to speech synthesis methods and devices, online text-to-speech synthesis systems, methods and devices, voice interaction systems, methods and devices, news broadcast systems, methods and devices, and electronic equipment.

Background

[0002] With the rapid development of speech synthesis technology and the increasing popularity of its applications, speech synthesis services are expanding rapidly and are increasingly accepted and used by users. As users' education levels rise, more and more application scenarios involve multilingual content; mixed reading of Chinese and English is especially common. There is therefore a demand for multilingual speech synthesis services, which drives the development of related technologies.

[0003] A typical multilingual speech synthesis system adopts the following processing method: first, based on ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/08, G10L13/04, G10L15/07
CPC: G10L13/086, G10L15/07
Inventors: 赵胜奎, 阮忠孝, 王昊, 马斌
Owner: ALIBABA GRP HLDG LTD