Training method and device for a speech translation model
A speech translation model training technology, applied in the field of speech translation. It addresses problems such as erroneous or inaccurate translation results and translation performance that still needs improvement, and it achieves more accurate model parameters and improved translation performance.
Examples
Example 1
[0061] It should be noted that the traditional speech translation method usually first performs speech recognition on the speech, recognizing it as text in the same language, and then uses text translation technology to translate the recognized text into text in another language, thereby realizing speech translation. However, this traditional method often suffers from error accumulation: if an error occurs during speech recognition, it is carried into the subsequent text translation step, resulting in inaccurate translation results.
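The error accumulation described above can be illustrated with a toy two-stage pipeline. This is only a sketch under stated assumptions: the lookup tables, the English-to-French word pairs, and the function names `recognize`, `translate`, and `cascade` are hypothetical stand-ins for real recognition and translation models, and none of them come from the application.

```python
# Toy cascade: speech -> recognized text -> translated text.
# Both stages are hypothetical lookup tables, not real models.

def recognize(audio_id, asr_table):
    """Stage 1: map an audio clip (represented by an ID) to source-language text."""
    return asr_table[audio_id]

def translate(text, mt_table):
    """Stage 2: translate word by word; unknown words pass through unchanged."""
    return " ".join(mt_table.get(word, word) for word in text.split())

def cascade(audio_id, asr_table, mt_table):
    """Run recognition first, then translate whatever text it produced."""
    return translate(recognize(audio_id, asr_table), mt_table)

# Hypothetical word-level translation table (assumption for illustration).
MT_TABLE = {"i": "je", "see": "vois", "sea": "mer", "the": "la"}

# The same clip, recognized correctly and with one recognition error.
CORRECT_ASR = {"clip1": "i see the sea"}
FAULTY_ASR = {"clip1": "i sea the sea"}  # "see" misrecognized as "sea"

good = cascade("clip1", CORRECT_ASR, MT_TABLE)
bad = cascade("clip1", FAULTY_ASR, MT_TABLE)
print(good)
print(bad)
```

The translation stage has no access to the audio, so it translates the misrecognized word faithfully and the recognition error reappears in the final output.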
[0062] Therefore, to address the above defects, an end-to-end speech translation model such as the one shown in Figure 1 can be used to perform speech translation. The speech translation model includes an encoder, an attention layer (Attention) and a decoder. Through this speech translation model, the source language speech can not be speech recogn...
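To make the role of the attention layer between encoder and decoder more concrete, here is a minimal sketch. The excerpt does not say which form of attention the model uses, so plain dot-product attention is an assumption, and the names `softmax`, `attend`, `encoder_states`, and `decoder_query` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Dot-product attention (assumed form): score each encoder state
    against the decoder query, then return the weights and the weighted
    average of the encoder states (the context vector)."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context

# Three encoder outputs (one per speech frame) and one decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_query = [2.0, 0.0]

weights, context = attend(decoder_query, encoder_states)
print(weights)
print(context)
```

Frames whose encoder state aligns with the decoder query receive larger weights, so the decoder reads a summary of the speech that is focused on the frames relevant to the word it is about to emit.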
Example 2
[0139] The above describes a specific implementation of the speech translation model training method provided in the first embodiment of the present application. Based on the speech translation model trained in that embodiment, the embodiments of the present application also provide a speech translation method.
[0140] See Figure 8, which shows a flow chart of a speech translation method provided by an embodiment of the present application. As shown in Figure 8, the method includes:
[0141] S801: Obtain a target speech to be translated.
[0142] In this embodiment, any speech translated by this embodiment is defined as the target speech. The language of the target speech is the same as that of the sample speech in the above-mentioned first embodiment.
[0143] It can be understood that the target speech can be obtained by recording according to actual needs. For example, the speech of a telephone conversation in people's daily life, or a recording of a meeting can be us...
Example 3
[0148] This embodiment will introduce a training device for a speech translation model. For related content, please refer to the above method embodiments.
[0149] See Figure 9, which is a schematic diagram of the composition of a speech translation model training device provided in this embodiment. The device 900 includes:
[0150] A training data acquisition unit 901, configured to acquire model training data, the model training data including each sample speech;
[0151] A translation text obtaining unit 902, configured to use the current speech translation model to directly translate the sample speech to obtain a predicted translation text, wherein the speech translation model shares some model parameters with a speech recognition model;
[0152] A recognition text obtaining unit 903, configured to use the current speech recognition model to recognize the sample speech to obtain a predicted recognition text;
[0153] The model parameter updating unit 904 is config...
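The parameter sharing between the speech translation model and the speech recognition model used by units 902 and 903 can be sketched as two models holding the same sub-module. This is an illustration under assumptions: the excerpt does not state which parameters are shared or how unit 904 updates them, so sharing the encoder is an assumed choice, and the classes `SharedEncoder`, `TranslationModel`, and `RecognitionModel` with their scalar weights are hypothetical.

```python
class SharedEncoder:
    """Hypothetical encoder with a single scalar parameter."""
    def __init__(self):
        self.weight = 1.0

    def encode(self, features):
        return [self.weight * value for value in features]


class TranslationModel:
    """Toy stand-in for the speech translation model (unit 902)."""
    def __init__(self, encoder):
        self.encoder = encoder  # shared with the recognition model
        self.head = 0.5         # parameter private to translation

    def predict(self, features):
        return sum(self.encoder.encode(features)) * self.head


class RecognitionModel:
    """Toy stand-in for the speech recognition model (unit 903)."""
    def __init__(self, encoder):
        self.encoder = encoder  # the same object as in TranslationModel
        self.head = 2.0         # parameter private to recognition

    def predict(self, features):
        return sum(self.encoder.encode(features)) * self.head


encoder = SharedEncoder()
st_model = TranslationModel(encoder)
asr_model = RecognitionModel(encoder)

sample = [1.0, 2.0]
before = (st_model.predict(sample), asr_model.predict(sample))

# Changing the shared parameter once affects both models.
encoder.weight = 2.0
after = (st_model.predict(sample), asr_model.predict(sample))
print(before, after)
```

Because both models reference one encoder object, any update to the shared parameters driven by either task is immediately visible to the other, while each model's own head stays independent.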