Text-to-speech conversion method and device, electronic device and storage medium

A text-to-speech technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses word segmentation problems such as the inability to segment words that are missing from the dictionary and the large impact of segmentation errors on the result, thereby improving word segmentation accuracy and ensuring the accuracy of the synthesized speech.

Status: Inactive
Publication Date: 2019-02-12

MOBVOI INC

4 Cites, 6 Cited by


Problems solved by technology

[0004] However, in practice the inventor found that word segmentation in the prior art is prone to errors. On the one hand, words that are not registered in the dictionary cannot be segmented correctly; such words occur frequently, for example as personal names and place names, and mis-segmenting them greatly affects the correctness of the result. On the other hand, consecutive digits may be split apart, because the possible combinations of digits are endless and cannot all be written into the dictionary.

Errors in the word segmentation system affect the synthesis quality of the entire TTS system.


Examples


Embodiment 1

[0035] Figure 1 is a flowchart of a text-to-speech method provided by Embodiment 1 of the present invention. The technical solution of this embodiment applies to text-to-speech conversion. The method can be performed by a text-to-speech device, which can be implemented in hardware and/or software. The method specifically includes:

[0036] Step 110, acquiring a preset text normalization template matching the text to be processed.

[0037] The user can enter text through an input device, or text can be obtained through Optical Character Recognition (OCR). The obtained text to be processed is then matched against the preset text normalization templates: all preset text normalization templates are searched for those that match the text to be processed.

[0038] Step 120: Perform text normalization processing on the text to be processed according to the matching preset text normalization template to obtain normalized text....
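Steps 110 and 120 can be illustrated with a minimal sketch. The excerpt does not say how a template is represented, so this sketch assumes each preset template is a regular expression paired with a rewrite function. The names `NormalizationTemplate`, `PRESET_TEMPLATES`, `find_matching_template`, and `normalize`, as well as the two toy templates, are hypothetical and are not taken from the patent.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def read_digits(digits: str) -> str:
    """Spell a digit string one digit at a time ("110" -> "one one zero")."""
    return " ".join(DIGIT_WORDS[int(d)] for d in digits)


@dataclass
class NormalizationTemplate:
    """Assumed template shape: a pattern plus a rule for speaking the match."""
    name: str
    pattern: re.Pattern
    rewrite: Callable[[re.Match], str]


# Hypothetical preset templates; a real library would hold many more.
PRESET_TEMPLATES: List[NormalizationTemplate] = [
    NormalizationTemplate(
        name="phone_number",
        pattern=re.compile(r"\b(\d{3})-(\d{4})\b"),
        rewrite=lambda m: f"{read_digits(m.group(1))} {read_digits(m.group(2))}",
    ),
    NormalizationTemplate(
        name="single_digit_percentage",
        pattern=re.compile(r"\b(\d)%"),
        rewrite=lambda m: f"{read_digits(m.group(1))} percent",
    ),
]


def find_matching_template(text: str) -> Optional[NormalizationTemplate]:
    """Step 110: return the first preset template that matches the text."""
    for template in PRESET_TEMPLATES:
        if template.pattern.search(text):
            return template
    return None


def normalize(text: str, template: NormalizationTemplate) -> str:
    """Step 120: rewrite every span matched by the template into spoken form."""
    return template.pattern.sub(template.rewrite, text)


if __name__ == "__main__":
    raw = "Call 555-0123 now"
    matched = find_matching_template(raw)
    if matched is not None:
        print(matched.name, "->", normalize(raw, matched))
```

Returning the matched template, rather than only the normalized text, keeps the template identity available for later steps that look up data associated with it.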

Embodiment 2

[0048] Figure 2 is a flowchart of a text-to-speech method provided by Embodiment 2 of the present invention. The technical solution of this embodiment further refines the above technical solution. The method includes:

[0049] Step 210, storing pre-generated preset text normalization templates in the text normalization template library.

[0050] The preset text normalization templates can be created by a designer and then stored in the text normalization template library. Templates stored in the library can be added, deleted, and updated.

[0051] Step 220: Store pre-segmentation templates in the text normalization template library, and establish a correspondence relationship between preset text normalization templates and pre-segmentation templates.

[0052] Among them, the designer can establish a pre-segmentation template for t...
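The template library of Steps 210 and 220 can be sketched as a small in-memory store. This is only an illustration under stated assumptions: the excerpt does not describe the storage format, so templates are held here as opaque strings keyed by an identifier, and the class `TemplateLibrary` and all of its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TemplateLibrary:
    """Toy template library; contents are opaque strings (an assumption)."""
    normalization: Dict[str, str] = field(default_factory=dict)     # id -> template
    pre_segmentation: Dict[str, str] = field(default_factory=dict)  # id -> template
    correspondence: Dict[str, str] = field(default_factory=dict)    # norm id -> pre-seg id

    def add_normalization(self, template_id: str, template: str) -> None:
        """Step 210: store a pre-generated normalization template."""
        self.normalization[template_id] = template

    def update_normalization(self, template_id: str, template: str) -> None:
        """Replace an existing normalization template."""
        if template_id not in self.normalization:
            raise KeyError(template_id)
        self.normalization[template_id] = template

    def delete_normalization(self, template_id: str) -> None:
        """Remove a normalization template together with its correspondence."""
        del self.normalization[template_id]
        self.correspondence.pop(template_id, None)

    def add_pre_segmentation(self, template_id: str, template: str) -> None:
        """Step 220, first half: store a pre-segmentation template."""
        self.pre_segmentation[template_id] = template

    def link(self, normalization_id: str, pre_segmentation_id: str) -> None:
        """Step 220, second half: record which pre-segmentation template
        corresponds to which normalization template."""
        if normalization_id not in self.normalization:
            raise KeyError(normalization_id)
        if pre_segmentation_id not in self.pre_segmentation:
            raise KeyError(pre_segmentation_id)
        self.correspondence[normalization_id] = pre_segmentation_id

    def pre_segmentation_for(self, normalization_id: str) -> Optional[str]:
        """Look up the pre-segmentation template linked to a normalization template."""
        linked = self.correspondence.get(normalization_id)
        return None if linked is None else self.pre_segmentation[linked]
```

Keeping the correspondence in its own mapping lets several normalization templates share one pre-segmentation template, though the excerpt does not say whether the patent allows that.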

Embodiment 3

[0065] Figure 3 is a schematic structural diagram of a text-to-speech device provided in Embodiment 3 of the present invention. The text-to-speech device 300 includes:

[0066] A preset text normalization template acquisition module 310, configured to acquire a preset text normalization template matching the text to be processed;

[0067] A normalized text determination module 320, configured to perform text normalization processing on the text to be processed according to the matched preset text normalization template to obtain the normalized text;

[0068] A pre-segmentation information adding module 330, configured to add pre-segmentation information to the normalized text according to the pre-segmentation template corresponding to the preset text normalization template;

[0069] A word segmentation text determination module 340, configured to perform word segmentation on the normalized text according to the pre-segmentation information and the word segmentation model, an...
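The interplay of modules 330 and 340 can be sketched as follows. Several assumptions are made that the excerpt does not confirm: the input is unspaced Chinese text, pre-segmentation information is a list of character spans that must each stay one word, and a simple forward maximum matching segmenter stands in for the patent's word segmentation model, which is not specified here. The names `pre_segment`, `segment`, `DICTIONARY`, and `DIGIT_RUN` are hypothetical.

```python
import re
from typing import Iterable, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive

# Toy dictionary: "I", "at/in", "year", "be born". Digit strings are absent,
# mirroring the problem that digit combinations cannot all be listed.
DICTIONARY: Set[str] = {"我", "在", "年", "出生"}

# Assumed pre-segmentation template: a run of Chinese digits is one word.
DIGIT_RUN = re.compile("[零一二三四五六七八九]+")


def pre_segment(text: str, template: re.Pattern) -> List[Span]:
    """Module 330 (sketch): mark spans that must not be split."""
    return [m.span() for m in template.finditer(text)]


def segment(text: str, dictionary: Set[str],
            protected: Iterable[Span] = ()) -> List[str]:
    """Module 340 (sketch): forward maximum matching that honors protected spans."""
    span_end = {start: end for start, end in protected}
    max_len = max((len(w) for w in dictionary), default=1)
    words: List[str] = []
    i = 0
    while i < len(text):
        if i in span_end:  # emit a protected span as a single word
            words.append(text[i:span_end[i]])
            i = span_end[i]
            continue
        # A dictionary match may not run into the next protected span.
        limit = min((s for s in span_end if s > i), default=len(text))
        for size in range(min(max_len, limit - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)  # single characters are the fallback
                i += size
                break
    return words


if __name__ == "__main__":
    normalized = "我在二零一九年出生"  # "I was born in 2019", digits spelled out
    print(segment(normalized, DICTIONARY))
    print(segment(normalized, DICTIONARY, pre_segment(normalized, DIGIT_RUN)))
```

Without the protected spans the digit run falls apart into single characters, which is the failure mode described in paragraph [0004]; with them, the run survives as one word regardless of what the dictionary contains.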


Abstract

The invention discloses a text-to-speech conversion method and device, an electronic device, and a storage medium. The method comprises the following steps: a preset text normalization template matching a to-be-processed text is obtained; text normalization is performed on the to-be-processed text according to the matched preset text normalization template to obtain a normalized text; according to a pre-segmentation template corresponding to the preset text normalization template, pre-segmentation information is added to the normalized text; the normalized text is segmented according to the pre-segmentation information and a word segmentation model to obtain a word-segmented text; and the word-segmented text is converted into speech. The method improves word segmentation accuracy and thereby helps ensure the accuracy of the synthesized speech.

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of natural language, and in particular to a text-to-speech method, device, electronic device, and storage medium.

Background

[0002] Text to Speech (TTS) is a technology that converts text into natural human speech. It is widely used in car navigation announcements, merchants' online customer service, and voice interaction with intelligent robots.

[0003] The TTS front end includes text normalization (TN) of the input text, word segmentation, phonetic transcription, part-of-speech prediction, and pause prediction. The accuracy of word segmentation directly affects the accuracy of the subsequent phonetic transcription, part-of-speech prediction, and pause prediction, and is therefore reflected in the naturalness of the final synthesis. The word segmentation scheme in the prior art requires a dictionary and a labeled training corpus. During training, the frequency of the current word ...


Application Information


Patent Type & Authority: Applications (China)

IPC: IPC(8): G10L13/08

CPC: G10L13/08

Inventor: 张征, 张冉

Owner: MOBVOI INC
