Transfer learning-based Chinese-Vietnamese neural machine translation method

A machine translation and transfer learning technique, applied in the field of natural language processing, that addresses the poor performance of Chinese-Vietnamese neural machine translation, yielding better decoding and improved translation performance.

Active Publication Date: 2019-11-19
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a transfer learning-based Chinese-Vietnamese neural machine translation method to address the poor performance of Chinese-Vietnamese neural machine translation.

Examples

Embodiment 1

[0030] Embodiment 1: As shown in Figures 1-2, the transfer learning-based Chinese-Vietnamese neural machine translation method proceeds by the following steps:

[0031] Step 1. Use web crawlers to collect the training corpora: a Chinese-Vietnamese corpus of 100,000 sentence pairs, an English-Vietnamese corpus of 700,000 sentence pairs, and a Chinese-English corpus of 50 million sentence pairs. Manually screen the crawled data and filter out garbled characters, then hold out part of the training data as the test set and validation set.

[0032] The crawled corpora are manually screened and then word-segmented; Arabic numerals are replaced with the token "num", and garbled characters are filtered out.
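
The preprocessing in [0031]-[0032] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the garbled-character heuristic (Unicode replacement and control characters), the number pattern, and the seeded random split are all hypothetical. Word segmentation is omitted because the source does not name a segmenter.

```python
import random
import re

# Assumption: a "numeral" is a run of digits, optionally with , or . separators.
NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")


def is_garbled(text):
    """Heuristic (assumed): replacement or control characters signal garbling."""
    return "\ufffd" in text or any(ord(c) < 32 and c not in "\t\n" for c in text)


def normalize(sentence):
    """Replace Arabic numerals with the placeholder token "num"."""
    return NUM_RE.sub("num", sentence).strip()


def preprocess(pairs):
    """Drop empty or garbled sentence pairs, then normalize numerals."""
    cleaned = []
    for src, tgt in pairs:
        if not src.strip() or not tgt.strip():
            continue
        if is_garbled(src) or is_garbled(tgt):
            continue
        cleaned.append((normalize(src), normalize(tgt)))
    return cleaned


def split_corpus(pairs, n_test, n_valid, seed=0):
    """Hold out test and validation sets from the training data."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]
    return train, valid, test
```

The held-out sizes are left as parameters because the source does not state how many sentence pairs go into the test and validation sets.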

[0033] Step 2. On the existing Chinese-English and English-Vietnamese datasets, apply back-translation through the pivot language, English. First, use a 4-layer attention-based neural machine translation system...
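
Since the remainder of Step 2 is truncated here, the following is only one plausible reading of how the pivot language could yield trilingual data: the English side of each bilingual pair is machine-translated into the missing third language. The function `build_trilingual` and the translator callables `en_to_vi` and `en_to_zh` are hypothetical stand-ins for trained translation systems, not names from the patent.

```python
def build_trilingual(zh_en_pairs, en_vi_pairs, en_to_vi, en_to_zh):
    """Build (Chinese, English, Vietnamese) triples via the English pivot.

    en_to_vi and en_to_zh are assumed callables mapping an English
    sentence to a Vietnamese or Chinese sentence, respectively.
    """
    triples = []
    # Chinese-English pairs: synthesize the Vietnamese side from English.
    for zh, en in zh_en_pairs:
        triples.append((zh, en, en_to_vi(en)))
    # English-Vietnamese pairs: synthesize the Chinese side from English.
    for en, vi in en_vi_pairs:
        triples.append((en_to_zh(en), en, vi))
    return triples
```

In practice the two callables would wrap trained translation models; passing them in as arguments keeps the sketch independent of any particular toolkit.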

Abstract

The invention relates to a transfer learning-based Chinese-Vietnamese neural machine translation method and belongs to the technical field of natural language processing. The method comprises the following steps: corpus collection and preprocessing, in which parallel corpora of Chinese-Vietnamese, English-Vietnamese and Chinese-English sentence pairs are collected and preprocessed; generating a Chinese-English-Vietnamese trilingual parallel corpus from the Chinese-English and English-Vietnamese parallel corpora; training a Chinese-English neural machine translation model and an English-Vietnamese neural machine translation model, and initializing the parameters of the Chinese-Vietnamese neural machine translation model with the parameters of the pre-trained models; and fine-tuning the initialized Chinese-Vietnamese model on the Chinese-Vietnamese parallel corpus to obtain the final Chinese-Vietnamese neural machine translation model. The method can effectively improve the performance of Chinese-Vietnamese neural machine translation.
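
The parameter-initialization step can be illustrated with a small sketch. The abstract does not say which parameters are taken from which pre-trained model, so the mapping below (encoder from the Chinese-English model, decoder from the English-Vietnamese model) is an assumption, as are the function name `init_from_pretrained` and the `encoder.`/`decoder.` naming scheme. Parameters are modeled as plain dictionaries of lists rather than tensors of any specific framework.

```python
import copy


def init_from_pretrained(zh_en_params, en_vi_params):
    """Assemble initial Chinese-Vietnamese parameters from two parent models.

    Assumption (not stated in the source): the Chinese encoder comes from
    the Chinese-English model and the Vietnamese decoder from the
    English-Vietnamese model.
    """
    init = {}
    for name, value in zh_en_params.items():
        if name.startswith("encoder."):
            init[name] = copy.deepcopy(value)
    for name, value in en_vi_params.items():
        if name.startswith("decoder."):
            init[name] = copy.deepcopy(value)
    return init
```

The deep copies keep the parent models unchanged, so fine-tuning the assembled model on the Chinese-Vietnamese corpus would not alter the pre-trained parameters.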

Description

Technical field

[0001] The invention relates to a transfer learning-based Chinese-Vietnamese neural machine translation method and belongs to the technical field of natural language processing.

Background

[0002] In recent years, exchanges between China and Vietnam have become increasingly frequent, and demand for translation technology in low-resource settings such as Chinese-Vietnamese is growing. However, current Chinese-Vietnamese neural machine translation performance is unsatisfactory, so improving the Chinese-Vietnamese neural machine translation system plays an important role in communication between the two countries. End-to-end neural machine translation (NMT) is a new translation approach that uses neural networks to map source-language text directly to target-language text. Neural machine translation has achieved very good translation performance on resource-rich language pairs, ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28; G06N3/08
CPC: G06N3/08
Inventors: 余正涛, 黄继豪, 郭军军, 文永华, 高盛祥, 王振晗
Owner: KUNMING UNIV OF SCI & TECH