Pivot-based Chinese-Vietnamese joint training neural machine translation method

A neural machine translation technique, applied in natural language translation, instruments, and special data processing applications, that addresses low translation quality and poor Chinese-Vietnamese machine translation performance caused by the limited scale and quality of parallel corpora, with the effect of improving translation performance.

Active Publication Date: 2021-01-22
KUNMING UNIV OF SCI & TECH
9 cites, 1 cited by

AI Technical Summary

Problems solved by technology

[0006] The invention provides a pivot-based Chinese-Vietnamese joint training neural machine translation method to address the problem that, in low-resource scenarios, the translation quality of neural machine translation falls below that of statistical machine translation. In particular, for the low-resource Chinese-Vietnamese language pair, the limited scale and quality of the Chinese-Vietnamese parallel corpus lead to poor machine translation performance.


Examples


Embodiment 1

[0061] Embodiment 1: As shown in Figures 1-4, the pivot-based Chinese-Vietnamese joint training neural machine translation method comprises the following steps:

[0062] Step 1. Obtain Chinese, English, and Vietnamese monolingual corpora, then preprocess them (filtering, denoising, stop-word removal, named entity recognition and labeling, and word segmentation) to construct Chinese-English, English-Vietnamese, and Chinese-Vietnamese parallel corpora;

[0063] Step 2. Pivot-based neural machine translation: neural machine translation with an attention mechanism first encodes the source-language sentence into a vector sequence and then decodes it to generate the target-language sentence. Use the existing source language-pivot language and pivot language-target language parallel corpora to train the translation models from the source language to the pivot language and from the pivot language to...
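The corpus filtering in Step 1 and the source-to-pivot-to-target chaining in Step 2 can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `clean_pairs`, `make_lookup_model`, and `pivot_translate` are illustrative helpers, the length-ratio threshold is an arbitrary assumption, the input is assumed to be already word-segmented, and the word-by-word lookup tables merely stand in for trained attention-based translation models.

```python
import re
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]


def clean_pairs(pairs: List[Pair], max_ratio: float = 3.0) -> List[Pair]:
    """Filter a segmented parallel corpus (hypothetical helper).

    Normalizes whitespace, then drops empty pairs, pairs whose token-length
    ratio exceeds max_ratio (an assumed threshold), and exact duplicates.
    """
    seen = set()
    kept: List[Pair] = []
    for src, tgt in pairs:
        src = re.sub(r"\s+", " ", src).strip()
        tgt = re.sub(r"\s+", " ", tgt).strip()
        if not src or not tgt:
            continue
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept


def make_lookup_model(table: Dict[str, str]) -> Callable[[str], str]:
    """Toy stand-in for a trained translation model: word-by-word lookup."""
    def translate(sentence: str) -> str:
        return " ".join(table.get(tok, "<unk>") for tok in sentence.split())
    return translate


def pivot_translate(src_to_pivot: Callable[[str], str],
                    pivot_to_tgt: Callable[[str], str],
                    sentence: str) -> str:
    """Translate source -> pivot, then pivot -> target."""
    return pivot_to_tgt(src_to_pivot(sentence))


# Chinese -> English -> Vietnamese with toy dictionaries (illustrative data).
zh_en = make_lookup_model({"我": "I", "爱": "love", "你": "you"})
en_vi = make_lookup_model({"I": "tôi", "love": "yêu", "you": "bạn"})
print(pivot_translate(zh_en, en_vi, "我 爱 你"))
```

The chaining makes the usual weakness of a plain pivot cascade visible: any error in the source-to-pivot output is passed directly to the pivot-to-target model, which is one reason to train the two models jointly rather than independently.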


Abstract

The invention relates to a pivot-based Chinese-Vietnamese joint training neural machine translation method, and belongs to the technical field of natural language processing. The method comprises the following steps: first, Chinese and Vietnamese word vector representations are obtained by training a translation model on Chinese-Vietnamese parallel corpora; second, with English as the pivot language, the Chinese-English and English-Vietnamese translation models are jointly trained, and the Chinese and Vietnamese vector representations of those two models are computed and optimized against the Chinese and Vietnamese vector representations acquired by the Chinese-Vietnamese model, so that Chinese and Vietnamese are trained jointly. The Chinese-Vietnamese, Chinese-English, and English-Vietnamese corpora are thus trained together. The English pivot corpus is fully used to improve Chinese-Vietnamese machine translation performance, which addresses the poor translation model performance caused by the lack of Chinese-Vietnamese parallel corpora.
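One possible reading of the joint training described in the abstract is a combined objective: the two pivot-side translation losses plus a term that pulls each model's Chinese or Vietnamese word vectors toward those learned by the Chinese-Vietnamese model. The sketch below is an assumption made for illustration, not the objective stated in the patent; the squared-distance term, the `weight` parameter, and the names `embedding_gap` and `joint_loss` are all hypothetical.

```python
from typing import Dict, List

Embedding = Dict[str, List[float]]


def embedding_gap(a: Embedding, b: Embedding) -> float:
    """Mean squared L2 distance between two embedding tables over shared words."""
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0
    total = 0.0
    for word in shared:
        total += sum((x - y) ** 2 for x, y in zip(a[word], b[word]))
    return total / len(shared)


def joint_loss(loss_zh_en: float,
               loss_en_vi: float,
               zh_from_zh_en: Embedding,
               zh_from_zh_vi: Embedding,
               vi_from_en_vi: Embedding,
               vi_from_zh_vi: Embedding,
               weight: float = 1.0) -> float:
    """Assumed joint objective: translation losses plus an embedding-agreement term.

    The agreement term ties the Chinese vectors of the Chinese-English model and
    the Vietnamese vectors of the English-Vietnamese model to the vectors
    obtained from the Chinese-Vietnamese model.
    """
    agreement = (embedding_gap(zh_from_zh_en, zh_from_zh_vi)
                 + embedding_gap(vi_from_en_vi, vi_from_zh_vi))
    return loss_zh_en + loss_en_vi + weight * agreement


# Illustrative values only.
zh_a = {"我": [1.0, 0.0]}
zh_b = {"我": [0.0, 0.0]}
vi_a = {"tôi": [0.0, 2.0]}
vi_b = {"tôi": [0.0, 0.0]}
print(joint_loss(2.0, 3.0, zh_a, zh_b, vi_a, vi_b, weight=0.5))
```

In a real system the two translation losses would come from the attention-based models themselves and the agreement term would be differentiated with respect to the embedding parameters; plain floats are used here only to keep the arithmetic visible.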

Description

Technical field

[0001] The invention relates to a pivot-based Chinese-Vietnamese joint training neural machine translation method, belonging to the technical field of natural language processing.

Background

[0002] Machine translation is an effective tool for large-scale language translation. In recent years, exchanges and cooperation between China and Vietnam have grown steadily closer, and machine translation is an effective means of cross-language information exchange. Research on Chinese-Vietnamese machine translation therefore has significant application value.

[0003] Neural machine translation is a machine translation method proposed in 2014. Current mainstream neural machine translation models all adopt the encoder-decoder architecture. Neural machine translation has achieved good translation performance on language pairs with large-scale parallel corpora, but in low-resource scenarios, the translation quality of neural machine trans...


Application Information

IPC(8): G06F40/58, G06F40/44, G06F40/295, G06F40/284, G06F40/205, G06F16/951
CPC: G06F40/58, G06F40/44, G06F40/295, G06F40/284, G06F40/205, G06F16/951
Inventors: 高盛祥, 张磊, 余正涛, 王振晗, 朱俊国, 刘畅
Owner KUNMING UNIV OF SCI & TECH