Automatic editing-after-translating system and method for multisource neural network based on splicing-remixing mode

A post-translation editing and neural network technology, applied in the fields of natural language processing and machine translation, that addresses problems such as missing translations and improves both translation fidelity and overall translation quality.

Active Publication Date: 2017-10-27
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to propose a multi-source neural network post-translation editing system and method based on splicing and remixing in order to solve the problem of a large number of missing translations in the existing neural network post-translation editing process

Examples

Embodiment 1

[0059] This embodiment, with reference to Figure 1, describes the detailed composition, training and decoding process of the multi-source neural network post-translation editing system and method based on splicing and remixing of the present invention.

[0060] As can be seen from Figure 1, the training module is connected to the decoding module.

[0061] The training process of the training module includes the following steps:

[0062] Step A: collecting the corpora required for the training process of the system;

[0063] The corpora mainly include the training original text corpus and the reference translation corpus, which are parallel corpora; assume N=600000, that is, the training original text has 600,000 sentences;

[0064] Training original text corpus, denoted as {source_1, source_2, ..., source_600000},

[0065] Training translation corpus, denoted as {ref_1, ref_2 ...
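
The parallel-corpus requirement in Step A can be illustrated with a minimal sketch. It assumes the corpora are held in memory as lists of sentences; the function name `pair_parallel` is hypothetical and does not come from the patent.

```python
def pair_parallel(sources, refs):
    """Pair source_i with ref_i, assuming both lists are sentence-aligned.

    Hypothetical helper: parallel corpora must have the same number of
    sentences, so a length mismatch is treated as an error.
    """
    if len(sources) != len(refs):
        raise ValueError(
            f"corpora are not parallel: {len(sources)} source sentences "
            f"vs {len(refs)} reference sentences"
        )
    return list(zip(sources, refs))


# Toy stand-ins for {source_1, source_2, ...} and {ref_1, ref_2, ...}.
pairs = pair_parallel(["source 1", "source 2"], ["ref 1", "ref 2"])
```

With the corpus size stated above, `sources` and `refs` would each hold 600,000 sentences.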

Embodiment 2

[0094] In this embodiment, a specific sentence is taken as an example to illustrate the effect of the system and method.

[0095] In a specific example, the quality of translation is intuitively reflected in fidelity and fluency, and the improvement of fidelity is subdivided into the improvement of word selection accuracy.

[0096] Suppose the translation of the original text is the sentence "However, the challenges in the past were not limited to funding public housing, private housing was also full of major tests."

[0097] The preliminary machine translation system is the Moses statistical machine translation system, and its translation result is "however, the pastchallenge, not in the funding of public housing, private housing is full of challenge." In this sentence, the keyword in the source text is translated as "funding", which means "providing funds for...", lacks the sense of assistance, and is not accurate enough. At the same time, t...

Embodiment 3

[0101] This embodiment illustrates, in a statistical sense, the advantages of this system and method in overall translation quality over two baselines: the single-source neural network automatic post-translation editing system, which does not add the source text and directly uses the preliminary translation results as the source language for training, and the multi-source neural network automatic post-translation editing system that only splices but does not remix.

[0102] Assume that there are 600,000 sentences in the training original text and reference translation data sets used by the training module, and 1,597 sentences in the source text data set used by the test module. The preliminary machine translation system is the Moses statistical machine translation system, and scoring uses the multi-bleu script. The BLEU value represents the overall translation quality, and the 1-gram to 4-gram scores are quantitative indicators of fidelity and fluency, re...
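
For readers unfamiliar with the metric, the following is a minimal sketch of corpus-level BLEU with clipped 1-gram to 4-gram precisions and a brevity penalty. It is not the multi-bleu script itself: it assumes one reference per sentence and whitespace tokenization, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Return (BLEU on a 0-100 scale, list of 1..max_n gram precisions)."""
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            hc, rc = ngram_counts(h, n), ngram_counts(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, rc[g]) for g, c in hc.items())
            total[n - 1] += sum(hc.values())
    precisions = [m / t if t else 0.0 for m, t in zip(matched, total)]
    if min(precisions) == 0.0:
        return 0.0, precisions
    # Brevity penalty: penalize output that is shorter than the reference.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    score = bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
    return 100.0 * score, precisions


score, precisions = corpus_bleu(
    ["the cat sat on the mat"], ["the cat sat on a mat"]
)
```

The four entries of `precisions` correspond to the 1-gram to 4-gram scores mentioned above, and `score` combines them into a single BLEU value.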

Abstract

The invention discloses an automatic editing-after-translating system and method for a multisource neural network based on a splicing-remixing mode, belonging to the technical fields of computer natural language processing and machine translation. The system comprises a training module and a decoding module, and the method is divided into a training process and a decoding process. The training process builds on a traditional neural network machine translation model: the source corpus is replaced with a new corpus generated by simple sentence splicing and remixing of the translation source text and the preliminary translation result, and the target corpus is replaced with a doubled reference translation, so that the preliminary translation result and the translation source text assist each other during training and achieve cross verification. In the decoding process, the trained system directly decodes the source corpus obtained by correspondingly splicing the translation source text and the preliminary translation result, and the resulting translation is better in fluency, accuracy and overall quality than the preliminary translation result that has not undergone editing-after-translating.
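
The data preparation described in this abstract can be sketched as follows. The text available here does not spell out the exact splicing and remixing operations, so this is one assumed reading rather than the patented procedure: each source sentence is concatenated with its preliminary translation in both orders (splicing), the resulting 2N pairs are shuffled (remixing), and each reference therefore appears twice (doubled target). The function names, the `sep` separator and the `seed` parameter are all hypothetical.

```python
import random


def splice_and_remix(sources, prelims, refs, sep=" ", seed=0):
    """Build a 2N-sentence training set under the ASSUMED reading above."""
    if not (len(sources) == len(prelims) == len(refs)):
        raise ValueError("corpora must be sentence-aligned")
    pairs = []
    for src, mt, ref in zip(sources, prelims, refs):
        pairs.append((src + sep + mt, ref))  # source text first
        pairs.append((mt + sep + src, ref))  # preliminary translation first
    # Shuffle source/target pairs together so alignment is preserved.
    random.Random(seed).shuffle(pairs)
    new_source = [s for s, _ in pairs]
    new_target = [t for _, t in pairs]
    return new_source, new_target


def splice_for_decoding(sources, prelims, sep=" "):
    """Splice each source sentence with its preliminary translation."""
    if len(sources) != len(prelims):
        raise ValueError("corpora must be sentence-aligned")
    return [src + sep + mt for src, mt in zip(sources, prelims)]


train_src, train_tgt = splice_and_remix(
    ["s1", "s2"], ["m1", "m2"], ["r1", "r2"]
)
decode_src = splice_for_decoding(["s1", "s2"], ["m1", "m2"])
```

Under this reading the target side has twice as many sentences as the original reference corpus, which matches the "doubled" reference translation in the abstract; other interpretations of the remixing step remain possible.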

Description

Technical field

[0001] The invention relates to a multi-source neural network post-translation editing system and method based on splicing and remixing, and belongs to the technical fields of computer application, natural language processing and machine translation.

Technical background

[0002] In recent years, with the advancement of the wave of globalization, international exchanges have become increasingly frequent, and the demand for translation services in all walks of life has become more urgent. Although machine translation has the advantage of being more efficient and convenient, there is still a big gap between its translation and human translation. Therefore, automatic post-editing of machine translation results to improve translation quality has important practical value.

[0003] The neural network automatic post-translation editing system is an improvement on the traditional automatic post-translation editing. It is good at generating sentences with high fluen...

Application Information

IPC(8): G06F17/28
CPC: G06F40/58
Inventor: 郭宇航, 黄河燕, 曹倩雯
Owner: BEIJING INSTITUTE OF TECHNOLOGY