Training sample enhancement method and device, computer equipment and storage medium

A training sample enhancement technology, applied in the field of machine translation, which addresses problems such as heavy workload, increased difficulty in machine learning, and inflexibility

Pending Publication Date: 2021-09-14
BEIJING IQIYI TECH CO LTD

AI Technical Summary

Problems solved by technology

It can be seen that although this method increases the number of training samples, it also increases the error noise.
In fact, the data added by word replacement and back-translation is pseudo data. While increasing the number of...

Examples

Embodiment Construction

[0031] To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on these embodiments, without creative effort, fall within the scope of protection of the present application.

[0032] In a first aspect, an embodiment of the present application provides a training sample enhancement method. As shown in Figure 1, the method includes the following steps:

[0033] S110, obtaining a first source language sentence and the associated sentences of the first source language sentence;

[0034] It can be understood that the solution provided by the present application can be applied to long...

Abstract

The invention relates to a training sample enhancement method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring a first source language statement and an associated statement of the first source language statement; splicing at least one statement in the associated statements with the first source language statement to obtain a second source language statement; determining a second target language statement according to the second source language statement; and forming a statement pair by the second source language statement and the second target language statement, and taking the statement pair as a training sample. The second source language statement is real data instead of pseudo data, so that the negative influence of noise generated by the pseudo data on the machine translation model can be reduced or avoided.
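
The four steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `enhance_training_samples`, the `determine_target` callback, and the choice to append each associated sentence after the first source sentence are all hypothetical. The abstract does not say how the second target language sentence is determined, so that step is delegated to a caller-supplied function.

```python
from typing import Callable, Iterable, List, Tuple

SentencePair = Tuple[str, str]


def enhance_training_samples(
    first_source: str,
    associated: Iterable[str],
    determine_target: Callable[[str], str],
    separator: str = " ",
) -> List[SentencePair]:
    """Build extra (source, target) pairs by splicing real sentences together."""
    samples: List[SentencePair] = []
    for sentence in associated:
        # Splice one associated sentence onto the first source sentence.
        # Appending it after the first sentence is an assumption; the
        # abstract does not fix the splicing order.
        second_source = separator.join([first_source, sentence])
        # The abstract only says the target is determined "according to"
        # the second source sentence, so the method is left to the caller.
        second_target = determine_target(second_source)
        # The sentence pair becomes one new training sample.
        samples.append((second_source, second_target))
    return samples


def toy_determine_target(source: str) -> str:
    # Placeholder only: tags the input instead of translating it.
    return f"<target of: {source}>"


pairs = enhance_training_samples(
    "It is raining.",
    ["Take an umbrella.", "The match is cancelled."],
    toy_determine_target,
)
for src, tgt in pairs:
    print(src, "=>", tgt)
```

In practice `determine_target` would be replaced by whatever produces a trusted target sentence, for example a human translator or an existing translation resource. Because both spliced parts are real sentences, the new source side contains no synthetic text.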

Description

Technical field

[0001] The present application relates to the field of machine translation technology, and in particular to a training sample enhancement method and apparatus, a machine translation model training method and apparatus, computer equipment, and a storage medium.

Background

[0002] Currently, machine translation uses a translation model, obtained by training on training samples, to translate a source language into a target language. Training a translation model requires a large number of training samples. A training sample is a sentence pair consisting of a source language sentence and a target language sentence, and such pairs are costly to acquire. For minority languages in particular, specialized translators are needed to annotate them. Expanding the training samples through data enhancement has therefore become increasingly popular among practitioners as a way to reduce costs.

[0003] Current data enhancement methods include word replacement and back-translation. Among these, word replacement, for example, "this cool a...

Application Information

IPC(8): G06F40/58
CPC: G06F40/58
Inventor: 张轩玮
Owner: BEIJING IQIYI TECH CO LTD