Speech synthesis method and device, equipment and storage medium

A speech synthesis technology, applied to speech synthesis methods, devices, computer equipment, and storage media, that addresses the low degree of fit between synthesized speech and the real human voice and thereby improves the user experience.

Pending Publication Date: 2021-09-03

PING AN TECH (SHENZHEN) CO LTD


Problems solved by technology

[0004] The technical problem to be solved by the present invention is that speech synthesized by current speech synthesis technology fits the real human voice poorly, which results in a poor user experience.


Examples


Embodiment 1

[0031] Referring to Figure 1, which is a schematic flowchart of a speech synthesis method disclosed in an embodiment of the present invention, the speech synthesis method may include the following operations:

[0032] 101. Input the reference speech sequence into a preset speech prosody analysis model for analysis to obtain speech prosody feature information.

[0033] In step 101 above, the reference speech sequence is the speech that the user wants the synthesized speech to resemble. For example, if the user wants the synthesized voice to be closer to the voice of person A, a real recording of person A speaking can be converted into a reference speech sequence. The prosody of speech includes its intensity, pitch, and duration, and the prosody of different speakers usually differs to some degree. The speech prosody analysis model analyzes the reference speech sequence, and the speech prosody fe...
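To make the prosody attributes named in step 101 concrete, the following minimal sketch computes hand-crafted proxies for intensity, pitch, and duration from a raw waveform. It is only an illustration: the patent's preset speech prosody analysis model is not specified in this excerpt, and the function names (frame_rms, estimate_pitch_hz, prosody_summary), the autocorrelation pitch estimate, the frame length, and the synthetic test tone are all assumptions introduced for this example.

```python
import math


def frame_rms(samples, frame_len):
    """Per-frame RMS energy, used here as a simple proxy for intensity."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def estimate_pitch_hz(samples, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate pitch from the strongest autocorrelation lag in [f_min, f_max]."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(samples) - 1)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def prosody_summary(samples, sample_rate, frame_len=400):
    """Hypothetical stand-in for the 'speech prosody feature information'."""
    return {
        "intensity_rms": frame_rms(samples, frame_len),
        "pitch_hz": estimate_pitch_hz(samples, sample_rate),
        "duration_s": len(samples) / sample_rate,
    }


# Synthetic "reference speech" (assumption): 0.2 s of a 200 Hz tone at 8 kHz.
SAMPLE_RATE = 8000
reference = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1600)]
features = prosody_summary(reference, SAMPLE_RATE)
```

A learned model would typically replace these fixed formulas, but the output plays the same role: a compact description of how the reference speaker sounds, independent of the words spoken.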

Embodiment 2

[0066] Referring to Figure 2, which is a schematic structural diagram of a speech synthesis device disclosed in an embodiment of the present invention, the speech synthesis device may include:

[0067] A speech prosody analysis module 201, configured to input the reference speech sequence into a preset speech prosody analysis model for analysis to obtain speech prosody feature information;

[0068] A text prosody analysis module 202, configured to input the target text sequence into a preset text prosody analysis model for analysis to obtain text prosody feature information;

[0069] A merge processing module 203, configured to perform preset merge processing on the speech prosody feature information and the text prosody feature information, to obtain prosody information for recording the prosody of the target speech to be synthesized;

[0070] A speech synthesis module 204, configured to synthesize the target speech based on the target text...

Embodiment 3

[0086] Referring to Figure 3, which is a schematic structural diagram of a computer device disclosed in an embodiment of the present invention, the computer device may include:

[0087] A memory 301 storing executable program codes;

[0088] A processor 302 connected to the memory 301;

[0089] The processor 302 invokes the executable program code stored in the memory 301 to execute the steps in the speech synthesis method disclosed in Embodiment 1 of the present invention.


Abstract

The invention discloses a speech synthesis method. The method comprises the following steps: inputting a reference speech sequence into a preset speech prosody analysis model for analysis to obtain speech prosody feature information; inputting a target text sequence into a preset text prosody analysis model for analysis to obtain text prosody feature information; performing preset merging processing on the speech prosody feature information and the text prosody feature information to obtain prosody information that records the prosody of the target speech to be synthesized; and synthesizing the target speech based on the target text sequence and the prosody information. When performing speech synthesis, the method can therefore combine the speech prosody of the reference speech with the text prosody of the target text, so that the synthesized speech is closer to a real human voice and the user experience is improved. The invention also relates to the technical field of blockchain.
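The four steps in the abstract can be read as a small pipeline. The sketch below wires them together under stated assumptions: the class name SpeechSynthesisPipeline, the toy stand-in models, and the choice of concatenation as the merge rule are hypothetical. The source only says that a "preset" merging process is applied and does not specify the models or the merge operation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


def concat_merge(speech_feats: Vector, text_feats: Vector) -> Vector:
    """Assumed merge rule: plain concatenation of the two feature vectors."""
    return list(speech_feats) + list(text_feats)


@dataclass
class SpeechSynthesisPipeline:
    # Each callable is a hypothetical placeholder for one of the preset models.
    speech_prosody_model: Callable[[Sequence[float]], Vector]
    text_prosody_model: Callable[[str], Vector]
    synthesizer: Callable[[str, Vector], Vector]
    merge: Callable[[Vector, Vector], Vector] = concat_merge

    def run(self, reference_speech: Sequence[float], target_text: str) -> Vector:
        speech_feats = self.speech_prosody_model(reference_speech)  # step 1
        text_feats = self.text_prosody_model(target_text)           # step 2
        prosody = self.merge(speech_feats, text_feats)              # step 3
        return self.synthesizer(target_text, prosody)               # step 4


# Toy stand-ins so the sketch runs end to end; real models would be learned.
def toy_speech_model(speech: Sequence[float]) -> Vector:
    # Mean absolute amplitude and sample count.
    return [sum(abs(x) for x in speech) / len(speech), float(len(speech))]


def toy_text_model(text: str) -> Vector:
    # Word count and comma count.
    return [float(len(text.split())), float(text.count(","))]


def toy_synthesizer(text: str, prosody: Vector) -> Vector:
    # One output value per character, scaled by the first prosody feature.
    return [prosody[0]] * len(text)


pipeline = SpeechSynthesisPipeline(toy_speech_model, toy_text_model, toy_synthesizer)
output = pipeline.run([0.5, -0.5, 0.5, -0.5], "hello, world")
```

Keeping the merge step as an injectable function reflects the fact that the merge rule is left open here; a weighted sum or a learned fusion layer could be swapped in without changing the rest of the pipeline.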

Description

Technical field

[0001] The invention relates to the technical field of speech synthesis, and in particular to a speech synthesis method, device, computer equipment and storage medium.

Background

[0002] With the development of computer technology, speech synthesis has become a mature technology that is widely used in everyday applications such as intelligent customer service, mobile phone voice assistants, and map navigation. As a result, users expect more and more of it. At present, users mainly care about whether the synthesized voice is close enough to a real human voice and whether it sounds natural and realistic. Traditional speech synthesis technology focuses on how to convert text sequences into speech sequences and pays less attention to whether the prosody of the converted speech is appropriate. Due to the lack of control over the prosody of synthesiz...


Application Information


IPC(8): G10L13/10

CPC: G10L13/10

Inventors: 张旭龙, 王健宗

Owner: PING AN TECH (SHENZHEN) CO LTD
