Text abstract generation system, method, device and computer readable storage medium

A text abstract generation technology, applied in text database browsing/visualization, unstructured text data retrieval, special data processing applications, etc., with the effect of addressing the asymmetry of one-directional generation.

Pending Publication Date: 2020-09-08
民生科技有限责任公司

AI Technical Summary

Problems solved by technology

However, generative summarization still has a problem: the quality of the summary sequence degrades as generation proceeds. Words generated early have high probability, but as the sequence prediction deepens, the conditional probability decreases, so the probability of each generated word decreases as well. As a result, the first few words are generated more accurately than the last few. If the sequence is instead generated from right to left, the last few words are generated more accurately than the first few. That is, whichever direction is used, generation suffers from directional tilt.
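
The decay described above can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that every decoding step is independently correct with a fixed probability `p_step` (a hypothetical value, not taken from the patent), and computes the chance that everything generated up to each position is correct.

```python
from typing import List


def prefix_accuracy(p_step: float, length: int) -> List[float]:
    """Probability that the first t tokens are all correct, for t = 1..length.

    Assumes each step is independently correct with probability p_step,
    which is a simplification made only for this illustration.
    """
    probs = []
    running = 1.0
    for _ in range(length):
        running *= p_step  # each extra step multiplies in one more factor
        probs.append(running)
    return probs


# Left-to-right decoding: reliability is highest at the first position.
left_to_right = prefix_accuracy(p_step=0.9, length=5)

# Right-to-left decoding starts from the last position, so the same decay
# appears mirrored: the final positions are now the most reliable ones.
right_to_left = list(reversed(left_to_right))
```

Under this assumption, neither direction is uniformly reliable across positions, which is the directional tilt the text refers to.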

Examples

Embodiment 1

[0084] The text summary generation process described in the present invention is especially applicable to news text, where its relative accuracy is the highest. News summary generation is used below as a specific case for illustration:

[0085] The present invention uses a seq2seq framework, with "news-abstract" pairs as training data, to realize a system that generates abstracts for news. The classic seq2seq framework handles the variable-length sequence-to-sequence problem; the BERT Chinese pre-trained model is used on the Encoder side to obtain the news input information, and a bidirectional decoding mechanism is introduced on the Decoder side, which improves the quality of the generated text to a certain extent. When the summary sequence is finally generated, following the idea of beam search, the top-K results in both directions are cached at the same time, and finally the one wit...
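
As a rough illustration of how a "news-abstract" pair might be prepared for two decoding directions, the sketch below builds one training example with a left-to-right target and a right-to-left target. The field names, the `BOS`/`EOS` tokens, and the choice to train the second direction on the reversed summary are assumptions made for illustration; the excerpt does not specify how the patent constructs its targets.

```python
from typing import Dict, List

# Hypothetical special tokens; the patent excerpt does not name any.
BOS, EOS = "<s>", "</s>"


def make_training_example(
    news_tokens: List[str], summary_tokens: List[str]
) -> Dict[str, List[str]]:
    """Build one training example from a tokenized news-abstract pair.

    The encoder sees the news text. Each decoding direction gets its own
    target: the summary as written, and the summary reversed (assumed here
    as the training signal for right-to-left decoding).
    """
    return {
        "encoder_input": list(news_tokens),
        "l2r_target": [BOS] + list(summary_tokens) + [EOS],
        "r2l_target": [BOS] + list(reversed(summary_tokens)) + [EOS],
    }


example = make_training_example(
    news_tokens=["the", "central", "bank", "cut", "rates", "today"],
    summary_tokens=["bank", "cuts", "rates"],
)
```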

Abstract

The invention provides a text abstract generation system, method, device and computer-readable storage medium. It uses a seq2seq framework, with text-abstract pairs as training data, to realize a system that generates abstracts for texts. The classic seq2seq framework handles the variable-length sequence-to-sequence problem; the BERT Chinese pre-trained model is used at the Encoder end to obtain the text input information, and a bidirectional decoding mechanism is introduced at the Decoder end, which improves the quality of the generated text to a certain extent. When the abstract sequence is finally generated, the top-K results in both directions are cached at the same time, following the idea of beam search, and the result with the highest probability is selected as the output.
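
The final selection step can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented procedure: it runs two independent beam searches, one per direction, keeps the top-K hypotheses from each, and returns the single most probable one. The step functions `toy_l2r` and `toy_r2l` are hypothetical stand-ins for trained decoders, and their probabilities are invented for the example. A real bidirectional decoder may couple the two directions at every step, which this sketch does not attempt.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"  # hypothetical end-of-sequence token
StepFn = Callable[[List[str]], Dict[str, float]]  # prefix -> next-token probs
Hypothesis = Tuple[List[str], float]  # (tokens, log-probability)


def beam_search(step_fn: StepFn, top_k: int, max_len: int) -> List[Hypothesis]:
    """Plain beam search; returns up to top_k hypotheses, best first."""
    beams: List[Hypothesis] = [([], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        candidates = [
            (tokens + [tok], logp + math.log(p))
            for tokens, logp in beams
            for tok, p in step_fn(tokens).items()
            if p > 0.0
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, logp in candidates[:top_k]:
            if tokens[-1] == EOS:
                finished.append((tokens[:-1], logp))  # drop the EOS marker
            else:
                beams.append((tokens, logp))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:top_k]


def bidirectional_decode(
    l2r_step: StepFn, r2l_step: StepFn, top_k: int = 3, max_len: int = 20
) -> Hypothesis:
    """Cache top-K results from both directions and return the most probable."""
    l2r = beam_search(l2r_step, top_k, max_len)
    # The right-to-left decoder emits tokens in reverse order, so flip them.
    r2l = [(tokens[::-1], logp) for tokens, logp in beam_search(r2l_step, top_k, max_len)]
    return max(l2r + r2l, key=lambda c: c[1])


def toy_l2r(prefix: List[str]) -> Dict[str, float]:
    """Invented left-to-right distribution, indexed only by position."""
    steps = [
        {"bank": 0.9, "rates": 0.1},
        {"cuts": 0.6, "raises": 0.4},
        {"rates": 0.6, EOS: 0.4},
    ]
    return steps[len(prefix)] if len(prefix) < len(steps) else {EOS: 1.0}


def toy_r2l(prefix: List[str]) -> Dict[str, float]:
    """Invented right-to-left distribution, indexed only by position."""
    steps = [
        {"rates": 0.9, "bank": 0.1},
        {"cuts": 0.8, "raises": 0.2},
        {"bank": 0.9, EOS: 0.1},
    ]
    return steps[len(prefix)] if len(prefix) < len(steps) else {EOS: 1.0}


best_tokens, best_logp = bidirectional_decode(toy_l2r, toy_r2l, top_k=2, max_len=10)
```

Comparing raw log-probabilities across the two directions follows the text's "highest probability" criterion; a length-normalized score would be a different design choice and is not claimed by the source.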

Description

Technical field

[0001] The present invention relates to the technical field of computer text processing, and in particular to a text abstract generation system, method, device and computer-readable storage medium based on a sequence-to-sequence framework with a bidirectional decoding mechanism.

Background art

[0002] Text summarization refers to refining and condensing content into a concise, intuitive summary of the main points that users care about, so that users can quickly understand and browse massive amounts of content.

[0003] With the advent of the Internet age, text summarization is becoming more and more important. First, the information explosion produces more and more text, and how to take in more information in a limited time has become a thorny issue; second, there is too much related information, and redundant, one-sided, and noisy information leads to information overload; finally, the popularization and use of mobi...

Application Information

IPC(8): G06F16/34
CPC: G06F16/345, Y02D10/00
Inventor: 李振张刚鲍东岳尹正刘昊霖张雨枫陈厚霖彭加欣
Owner: 民生科技有限责任公司