Text generation model training method, target corpus expansion method and related device

A text generation model training method, applied in the field of text processing, intended to improve the performance of the text generation model.

Pending Publication Date: 2022-05-10
ZHEJIANG DAHUA TECH CO LTD

AI Technical Summary

Problems solved by technology

The existing scheme first trains a text generation model and then uses it to iteratively generate the next word from a given beginning word; it does not make good use of the existing corpus to guide the training of the text generation model.



Embodiment Construction

[0025] The solutions of the embodiments of the present application will be described in detail below in conjunction with the accompanying drawings.

[0026] In the following description, for purposes of illustration rather than limitation, specific details, such as specific system architectures, interfaces, and techniques, are set forth in order to provide a thorough understanding of the present application.

[0027] The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it. In addition, "multiple" herein means two or more.

[0028] See Figure 1. Figure 1 ...


Abstract

The invention discloses a text generation model training method, a target corpus expansion method, and a related device. The training method of the text generation model comprises the following steps: acquiring a sample corpus; performing word segmentation on the sample corpus and generating a statistical language model from the word segmentation result; generating a target text with a generator of the text generation model; discriminating the target text against the sample corpus with a discriminator of the text generation model, outputting a discrimination result, and obtaining an adversarial loss function from the discrimination result; obtaining the perplexity of the target text with the statistical language model and determining a penalty term from the perplexity; and superposing the adversarial loss function and the penalty term to obtain a target loss function of the text generation model, and training the text generation model with the target loss function to obtain a trained text generation model. With this scheme, the existing corpus can be used to guide the training of the text generation model, improving its performance.
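The way the statistical language model feeds into the training loss can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the bigram model with add-one smoothing, the whitespace segmentation, the function names, the `weight` parameter, and the choice of penalty (`weight * log(perplexity)`) are all assumptions made for illustration. The source states only that a penalty term is determined from the perplexity (sometimes machine-translated as "confusion degree") and superposed on the adversarial loss. The adversarial loss itself is passed in as a plain number, since the generator and discriminator are not specified here.

```python
import math
from collections import Counter


def train_bigram_lm(segmented_corpus):
    """Count unigrams and bigrams over already-segmented sentences.

    Assumption: a bigram model stands in for the statistical language model.
    """
    unigrams, bigrams = Counter(), Counter()
    for words in segmented_corpus:
        tokens = ["<s>"] + list(words) + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(words, unigrams, bigrams):
    """Perplexity of one segmented sentence under the add-one-smoothed bigram model."""
    tokens = ["<s>"] + list(words) + ["</s>"]
    vocab_size = len(unigrams)
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams at non-zero probability.
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    n_transitions = len(tokens) - 1
    return math.exp(-log_prob / n_transitions)


def target_loss(adversarial_loss, ppl, weight=0.1):
    """Superpose the adversarial loss and a perplexity-based penalty term.

    Assumption: penalty = weight * log(perplexity); the source does not give the formula.
    """
    penalty = weight * math.log(ppl)
    return adversarial_loss + penalty


# Hypothetical sample corpus; whitespace splitting stands in for word segmentation.
sample_corpus = ["the cat sat", "the dog sat"]
segmented = [sentence.split() for sentence in sample_corpus]
unigrams, bigrams = train_bigram_lm(segmented)

fluent_ppl = perplexity("the cat sat".split(), unigrams, bigrams)
scrambled_ppl = perplexity("sat cat the".split(), unigrams, bigrams)

# With the same adversarial loss, the less fluent text receives the larger target loss.
fluent_loss = target_loss(1.0, fluent_ppl)
scrambled_loss = target_loss(1.0, scrambled_ppl)
```

In this sketch a generated text that resembles the sample corpus gets a lower perplexity and therefore a smaller penalty, which is one way the existing corpus could steer training alongside the discriminator's signal.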

Description

Technical field

[0001] The present application relates to the technical field of text processing, in particular to a text generation model training method, a target corpus expansion method, and related devices.

Background

[0002] Speech recognition includes two parts: an acoustic model and a language model. Among language models, the statistical language model is the most widely used, but it requires the support of a large-scale corpus. In practice, corpora are difficult to obtain and are especially scarce in specific scenarios. Automatic text generation is an important technology in artificial intelligence and natural language processing and is an effective way to address the lack of corpus. It has a wide range of application scenarios, such as "robot writing", "automatic dialogue generation", and "Chinese Lyrics Automatic Generator".

[0003] The text generation t...


Application Information

IPC(8): G06N3/04, G06N3/08, G06F40/289
CPC: G06N3/084, G06F40/289, G06N3/044
Inventor: 岳昌洁, 张锦铖, 黄惠祥, 史巍, 林聚财, 殷俊
Owner: ZHEJIANG DAHUA TECH CO LTD