Personalized speech translation method and device based on speaker features

A speech translation technology based on speaker features, applied in the field of speech translation, that addresses the failure of existing systems to apply the speaker's characteristics throughout a personalized translation system.

Active Publication Date: 2020-10-16

SICHUAN CHANGHONG ELECTRIC CO LTD

16 Cites, 7 Cited by

Problems solved by technology

[0007] The present invention provides a personalized speech translation method and device based on speaker features, to solve the problem in the prior art that speaker features are not applied across the entire personalized translation system, from speaker speech to text to speech.

Examples

Embodiment 1

[0043] Referring to Figures 1-3, a method for personalized speech translation based on speaker features comprises the following steps:

[0044] Step 1: collect the speaker's speech, extract the acoustic features of the speech, and convert them into a speaker feature vector.

[0045] Specifically, the acoustic features are extracted by applying a windowed Fourier transform to the speaker's speech to obtain linear features, which are then processed through a Mel filter to obtain the speaker's speech acoustic features.
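The extraction described in [0045] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the Hann window, the 16 kHz sample rate, the frame and hop sizes, the number of Mel filters, the final logarithm, and the function names `mel_filterbank` and `mel_features` are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz-to-Mel conversion.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mel_features(signal, sample_rate=16000, n_fft=512, hop=160, n_mels=40):
    """Windowed Fourier transform followed by Mel filtering (assumed parameters).

    Expects a 1-D signal with at least n_fft samples.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Linear features: power spectrum of each windowed frame.
    linear = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Mel features: project the linear spectrum onto the Mel filters.
    mel = linear @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return np.log(mel + 1e-10)  # log compression is an assumption, not stated in the source

# Usage: one second of a 440 Hz tone sampled at 16 kHz.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
features = mel_features(tone)  # one row per frame, one column per Mel filter
```

The result is a two-dimensional array of per-frame acoustic features, which is the kind of input the later steps pass to the speaker feature vector model.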

[0046] Speech acoustic features extracted from speakers with different intonation characteristics are input into the deep speech recognition model and trained with a deep learning network, yielding a speaker feature vector model that corresponds to the speech acoustic features of different speakers.

[0047] The speaker's speech acoustic features extracted by the speaker are input into the speaker feature ve...

Embodiment 2

[0062] In this embodiment, a personalized speech translation device based on speaker features includes a speaker audio feature extraction unit, a speaker speech recognition unit, a translation unit, an encoder unit, and an end-to-end text feature-to-audio feature unit.

[0063] The speaker audio feature extraction unit performs a windowed Fourier transform on the speaker's speech to obtain linear features, obtains the speaker's speech acoustic features through Mel filter processing, and inputs these acoustic features into the speaker feature vector model to obtain the speaker feature vector.

[0064] The speaker speech recognition unit recognizes the speech as the corresponding text, using the speaker feature vector combined with the speaker's speech acoustic features as the neural network input of the text recognition model.
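One way the speaker feature vector might be combined with the acoustic features before recognition is sketched below. The source does not say how the two are combined; repeating the speaker vector for every frame and concatenating it along the feature axis is an assumption made for illustration, and `build_recognizer_input` is a hypothetical name.

```python
import numpy as np

def build_recognizer_input(acoustic_frames, speaker_vector):
    """Append the same speaker vector to every acoustic frame.

    acoustic_frames: array of shape (n_frames, n_features)
    speaker_vector:  array of shape (n_speaker_dims,)
    Returns an array of shape (n_frames, n_features + n_speaker_dims).
    Concatenation is an assumed combination rule, not taken from the patent.
    """
    acoustic_frames = np.asarray(acoustic_frames, dtype=float)
    speaker_vector = np.asarray(speaker_vector, dtype=float)
    if acoustic_frames.ndim != 2 or speaker_vector.ndim != 1:
        raise ValueError("expected 2-D acoustic frames and a 1-D speaker vector")
    # Repeat the utterance-level speaker vector once per frame.
    tiled = np.tile(speaker_vector, (acoustic_frames.shape[0], 1))
    return np.concatenate([acoustic_frames, tiled], axis=1)

# Usage: 5 frames of 40 Mel features plus an 8-dimensional speaker vector.
combined = build_recognizer_input(np.zeros((5, 40)), np.ones(8))
```

With this rule every frame carries the same speaker information, so a recognition network can condition on the speaker at each time step.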

[0065] The translation unit is used to translate the speaker's language into the target language. This unit translatio...

Abstract

The invention discloses a personalized speech translation method based on speaker features, comprising the following steps: collecting the speaker's speech, extracting its speech acoustic features, and converting them into a speaker feature vector; performing speaker text recognition by combining the speaker feature vector with the speaker's speech acoustic features; translating the speaker's text into text in the target language; combining the text code of the target language generated in the previous step with the speaker feature vector generated in the first step to obtain a target text vector with speaker features; and generating the target speech from that target text vector through a text-to-speech model. By adding the speaker feature extraction network, the moods and tones of different speakers can be incorporated into the speech recognition and text-to-speech conversion processes, so that the speaker's meaning is translated more accurately. The invention further discloses a personalized speech translation device based on speaker features.
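The data flow of the five steps in the abstract can be sketched as a small pipeline. This is only an outline of the ordering and of where the speaker feature vector is reused; the class name `SpeechTranslationPipeline`, its field names, and the idea of passing each stage in as a callable are assumptions, and the stages themselves (models for recognition, translation, and synthesis) are left as placeholders.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SpeechTranslationPipeline:
    """Hypothetical wiring of the stages named in the abstract."""
    extract_features: Callable[[Any], Any]   # speech -> acoustic features
    speaker_model: Callable[[Any], Any]      # acoustic features -> speaker feature vector
    recognize: Callable[[Any, Any], str]     # (features, speaker vector) -> source text
    translate: Callable[[str], str]          # source text -> target-language text
    encode_text: Callable[[str], Any]        # target text -> text code
    combine: Callable[[Any, Any], Any]       # (text code, speaker vector) -> target text vector
    synthesize: Callable[[Any], Any]         # target text vector -> target speech

    def run(self, audio: Any) -> Any:
        features = self.extract_features(audio)
        speaker_vec = self.speaker_model(features)           # step 1
        source_text = self.recognize(features, speaker_vec)  # step 2
        target_text = self.translate(source_text)            # step 3
        text_code = self.encode_text(target_text)
        target_vec = self.combine(text_code, speaker_vec)    # step 4
        return self.synthesize(target_vec)                   # step 5

# Usage with stand-in stages; real stages would be trained models.
demo = SpeechTranslationPipeline(
    extract_features=lambda audio: ("features", audio),
    speaker_model=lambda feats: "speaker-vector",
    recognize=lambda feats, spk: "hello",
    translate=lambda text: text.upper(),
    encode_text=lambda text: list(text),
    combine=lambda code, spk: (code, spk),
    synthesize=lambda vec: vec,
)
result = demo.run("raw-audio")
```

The point of the sketch is that the speaker feature vector computed in step 1 is consumed twice: once by recognition and once when building the target text vector for synthesis.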

Description

Technical field

[0001] The invention relates to the technical field of speech translation, in particular to a personalized speech translation method and device based on speaker features.

Background

[0002] With the development of globalization and the increase of exchanges between countries, real-time speech translation is becoming increasingly important. In traditional speech translation, a change in the speaker's tone may cause the translation to fail to express the speaker's meaning. In addition, interpretations differ between regions and certain words may be pronounced differently, which highlights the importance of personalized translation.

[0003] At the same time, during translation the translated result may differ from what the speaker actually intends because of differences in the speaker's accent and intonation. For example, the message that the speaker wants to express is "Is there a hot dog seller nearby?" After speech recog...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/02, G10L15/26, G06F40/58, G10L13/04

CPC: G10L15/02, G10L15/26, G06F40/58, G10L13/04

Inventor: 周琳岷, 王昆, 朱海

Owner: SICHUAN CHANGHONG ELECTRIC CO LTD
