Chinese-Vietnamese Jointly Training Neural Machine Translation Method Based on Pivot

A neural machine translation technique, applied in natural language translation, instruments, network data indexing, etc. It addresses the poor performance and low translation quality of Chinese-Vietnamese machine translation caused by the limited scale and quality of the parallel corpus, and thereby improves Chinese-Vietnamese neural machine translation performance.

Active Publication Date: 2022-06-21
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0006] The invention provides a pivot-based Chinese-Vietnamese joint training neural machine translation method. It addresses the problem that, in low-resource scenarios, the translation quality of neural machine translation is lower than that of statistical machine translation. In particular, for the low-resource Chinese-Vietnamese language pair, the limited scale and quality of the Chinese-Vietnamese parallel corpus lead to poor machine translation performance.



Examples


Embodiment 1

[0061] Embodiment 1: As shown in Figures 1-4, the specific steps of the pivot-based Chinese-Vietnamese joint training neural machine translation method are as follows:

[0062] Step 1. Obtain Chinese, English and Vietnamese monolingual corpora. After preprocessing (filtering, denoising, stop-word removal, named entity recognition and labeling, and word segmentation), construct the Chinese-English, English-Vietnamese and Chinese-Vietnamese parallel corpora;

[0063] Step 2. Pivot-based neural machine translation. A neural machine translation model with an attention mechanism first encodes the source-language sentence into a vector sequence and then decodes it to generate the target-language sentence. The existing source-pivot and pivot-target parallel corpora are used to train the source-to-pivot and pivot-to-target translation models, respectively;

[0064] Step 3. The Chinese-Vietnamese joint t...
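
Steps 1 and 2 above can be pictured with a small Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names `filter_parallel`, `pivot_translate` and `make_lookup_translator`, the thresholds `max_len` and `max_ratio`, and the toy word tables are all hypothetical. The word-lookup translators merely stand in for trained attention-based Chinese-to-English and English-to-Vietnamese models.

```python
from typing import Callable, Dict, Iterable, List, Tuple

SentencePair = Tuple[List[str], List[str]]
Translator = Callable[[List[str]], List[str]]


def filter_parallel(pairs: Iterable[SentencePair],
                    max_len: int = 100,
                    max_ratio: float = 3.0) -> List[SentencePair]:
    """Keep tokenized pairs that are non-empty, not too long,
    and whose lengths are comparable (assumed filtering heuristics)."""
    kept: List[SentencePair] = []
    for src, tgt in pairs:
        if not src or not tgt:
            continue  # drop pairs with an empty side
        if len(src) > max_len or len(tgt) > max_len:
            continue  # drop overly long sentences
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_ratio:
            continue  # drop pairs that are likely misaligned
        kept.append((src, tgt))
    return kept


def pivot_translate(src_tokens: List[str],
                    src_to_pivot: Translator,
                    pivot_to_tgt: Translator) -> List[str]:
    """Translate source -> pivot, then pivot -> target."""
    pivot_tokens = src_to_pivot(src_tokens)
    return pivot_to_tgt(pivot_tokens)


def make_lookup_translator(table: Dict[str, str]) -> Translator:
    """Toy word-by-word stand-in for a trained translation model."""
    def translate(tokens: List[str]) -> List[str]:
        # Unknown tokens are copied through unchanged.
        return [table.get(tok, tok) for tok in tokens]
    return translate


# Hypothetical toy models: Chinese -> English and English -> Vietnamese.
zh_en = make_lookup_translator({"我": "I", "爱": "love", "你": "you"})
en_vi = make_lookup_translator({"I": "tôi", "love": "yêu", "you": "bạn"})

result = pivot_translate(["我", "爱", "你"], zh_en, en_vi)
```

In this cascade any error made in the Chinese-to-English stage is passed on to the English-to-Vietnamese stage, which is one reason to train the models jointly rather than independently.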


Abstract

The invention relates to a pivot-based Chinese-Vietnamese joint training neural machine translation method, belonging to the technical field of natural language processing. The method comprises the following steps. First, a translation model is trained on the Chinese-Vietnamese parallel corpus to obtain word vector representations of Chinese and Vietnamese. Second, the Chinese-English and English-Vietnamese translation models are jointly trained with English as the pivot language. Then the Chinese and Vietnamese vector representations from the Chinese-English and English-Vietnamese translation models and those obtained by the Chinese-Vietnamese model are optimized through Chinese-Vietnamese joint training. By combining the Chinese-Vietnamese parallel corpus with the Chinese-English and English-Vietnamese parallel corpora for joint training, the invention makes full use of the English pivot corpus to improve Chinese-Vietnamese machine translation, thereby addressing the poor translation model performance caused by the scarcity of Chinese-Vietnamese parallel data.
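
One way to picture the joint training described above is as a single objective that sums the log-likelihoods of the three translation models and subtracts a penalty that ties their word vectors together. The following is only a sketch under assumptions: the symbols $\theta$, $D$ and $\lambda$, the vocabularies $V_{zh}$ and $V_{vi}$, and the squared-distance form of $\mathcal{R}$ are illustrative choices and are not taken from the patent text.

```latex
\begin{aligned}
\mathcal{J}(\theta_{zh\to en}, \theta_{en\to vi}, \theta_{zh\to vi})
  &= \mathcal{L}(\theta_{zh\to en}; D_{zh,en})
   + \mathcal{L}(\theta_{en\to vi}; D_{en,vi})
   + \mathcal{L}(\theta_{zh\to vi}; D_{zh,vi})
   - \lambda\, \mathcal{R}, \\
\mathcal{L}(\theta_{x\to y}; D_{x,y})
  &= \sum_{(\mathbf{x},\mathbf{y}) \in D_{x,y}}
     \log P(\mathbf{y} \mid \mathbf{x}; \theta_{x\to y}), \\
\mathcal{R}
  &= \sum_{w \in V_{zh}}
     \lVert \mathbf{e}^{zh\to en}_{w} - \mathbf{e}^{zh\to vi}_{w} \rVert_2^2
   + \sum_{w \in V_{vi}}
     \lVert \mathbf{e}^{en\to vi}_{w} - \mathbf{e}^{zh\to vi}_{w} \rVert_2^2 .
\end{aligned}
```

Here $\mathbf{e}^{m}_{w}$ denotes the vector of word $w$ in model $m$. Maximizing $\mathcal{J}$ trains the three models together; with $\lambda = 0$ the objective reduces to training them independently.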

Description

Technical field

[0001] The invention relates to a pivot-based Chinese-Vietnamese joint training neural machine translation method, and belongs to the technical field of natural language processing.

Background

[0002] Machine translation is an effective tool for large-scale language translation. In recent years, exchanges and cooperation between China and Vietnam have become increasingly close, and machine translation is an effective way to exchange information across languages. Research on Chinese-Vietnamese machine translation therefore has significant application value.

[0003] Neural machine translation is a machine translation method proposed in 2014, and the current mainstream neural machine translation models all use an encoder-decoder architecture. Neural machine translation has achieved good translation performance on language pairs with large-scale parallel corpora, but in low-resource scenarios, the translation quality of neural mac...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/58, G06F40/44, G06F40/295, G06F40/284, G06F40/205, G06F16/951
CPC: G06F40/58, G06F40/44, G06F40/295, G06F40/284, G06F40/205, G06F16/951
Inventor: 高盛祥, 张磊, 余正涛, 王振晗, 朱俊国, 刘畅
Owner: KUNMING UNIV OF SCI & TECH