
Automatic text abstract generation method based on dynamic word vectors

A dynamic word vector technology for automatic text summarization, applied in neural learning methods, unstructured text data retrieval, text database browsing/visualization, etc. It addresses problems such as overly long input texts, time-consuming training, and the low quality and efficiency of text summary generation, achieving higher accuracy and fluency.

Inactive Publication Date: 2019-12-27
10TH RES INST OF CETC
Cites: 3 · Cited by: 12

AI Technical Summary

Problems solved by technology

The rapid development of deep neural networks in recent years, with their powerful representation capabilities, has made it possible to build excellent generative summarization models. Many generative neural network models have surpassed the best extractive models on public test sets, but they are still limited by problems such as overly long input texts and poor content extraction.
The traditional recurrent neural network (RNN) is well suited to modeling text sequences, but its computations cannot be parallelized, so training is very time-consuming; at the same time, its multi-step recurrence suffers from long-term dependence problems such as vanishing gradients, exploding gradients and semantic loss, which keeps the quality and efficiency of summary generation low. To address this defect, Facebook AI Labs proposed the more efficient ConvS2S model based on the convolutional neural network (CNN), but a CNN cannot directly process variable-length text sequences. The Google team abandoned CNN and RNN entirely and proposed the Transformer model, built solely on the attention mechanism; it not only remedies the RNN's difficulty with parallelization and long-term dependence, but also solves the CNN's difficulty with variable-length sequence samples.
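
Below is a minimal NumPy sketch of scaled dot-product self-attention, added only to illustrate the property this paragraph relies on: every position attends to every other position through a single matrix product, so the computation parallelizes across the sequence and the sequence length can vary freely. It is not the patent's implementation; all names and dimensions are illustrative.

```python
# Illustrative scaled dot-product self-attention (not the patented method).
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (n, d) word vectors of one sequence; Wq/Wk/Wv: (d, d) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # (n, n) pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # (n, d) context-mixed output

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                            # a 5-token "sentence"
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)             # (5, 8)
```

Because nothing in this computation depends on a fixed sequence length, the same projection matrices handle inputs of any length, which is the advantage over the CNN-based approach mentioned above.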



Examples


Embodiment Construction

[0013] Refer to figure 1. According to the present invention, the text is first preprocessed by the text preprocessing module, including word segmentation, high-frequency word filtering and part-of-speech tagging, and the processed text is turned into initial word vectors. The initial word vectors are then input into the ELMo model module to generate preliminary dynamic word vectors. At the same time, the text is input into the Doc2Vec sentence vector module to obtain a sentence vector for each sentence; the sentence vectors are then input into the self-attention mechanism module, which calculates each sentence's importance weight with respect to the summary result and produces weighted sentence vectors. The weighted sentence vector is used as the environment feature vector of each word in that sentence; this environment feature vector is added to the preliminary dynamic word vector to obtain the final dynamic word vector, which is input into the Transformer neural network model to generate a high-quality text summary.
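
As an informal illustration of this data flow (an assumption only; the patent publishes no code), the sketch below strings the modules together with toy stand-ins for ELMo, Doc2Vec and the sentence-level self-attention, showing how each weighted sentence vector is added to every word vector of its sentence before the result would be passed to a Transformer (the Transformer itself is omitted). All function names, shapes and dimensions are hypothetical.

```python
# Toy pipeline mirroring paragraph [0013]; the real ELMo/Doc2Vec/Transformer
# components are replaced by random stand-ins, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
D = 16                                    # shared embedding dimension (assumed)

def elmo_word_vectors(sentences):
    """Stand-in for the ELMo module: one preliminary dynamic vector per word."""
    return [rng.normal(size=(len(s), D)) for s in sentences]

def doc2vec_sentence_vectors(sentences):
    """Stand-in for the Doc2Vec module: one vector per sentence."""
    return rng.normal(size=(len(sentences), D))

def sentence_self_attention(S):
    """Weight sentence vectors by importance via self-attention over sentences."""
    scores = S @ S.T / np.sqrt(D)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ S                          # weighted sentence (environment) vectors

sentences = [["text", "summarization", "matters"], ["transformers", "help"]]
word_vecs = elmo_word_vectors(sentences)
env_vecs = sentence_self_attention(doc2vec_sentence_vectors(sentences))

# Add each sentence's environment feature vector to every word vector it contains
# to obtain the final dynamic word vectors that would be fed to the Transformer.
final_word_vecs = [wv + env for wv, env in zip(word_vecs, env_vecs)]
print([v.shape for v in final_word_vecs])  # [(3, 16), (2, 16)]
```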



Abstract

The invention discloses an automatic text abstract generation method based on dynamic word vectors, and aims to provide an automatic text abstract generation method with higher accuracy and fluency. According to the technical scheme, the method comprises the steps of firstly preprocessing a text through a text preprocessing module, wherein the preprocessing comprises word segmentation operation, high-frequency word filtering and part-of-speech tagging, and forming an initial word vector from the processed text; inputting the initial word vector into an ELMo model module to generate a preliminary dynamic word vector; meanwhile, inputting the text into a Doc2Vec sentence vector module to obtain a sentence vector of each sentence; inputting the sentence vector into a self-attention mechanism module to calculate an importance weight relative to the abstract result so as to generate a weighted sentence vector; and taking the weighted sentence vector as an environment feature vector of each word, adding the environment feature vector and the initial dynamic word vector to obtain a final dynamic word vector, and inputting the dynamic word vector into a Transformer neural network model to generate a high-quality text abstract.
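
As one concrete (and assumed) way to realize the Doc2Vec sentence-vector step named in the abstract, gensim's Doc2Vec can be trained on tokenized sentences and then queried for one vector per sentence; the parameters below are illustrative only and are not specified by the patent.

```python
# Sentence vectors via gensim Doc2Vec (illustrative parameters, not from the patent).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

sentences = [["deep", "learning", "builds", "summaries"],
             ["attention", "replaces", "recurrence"]]
docs = [TaggedDocument(words=s, tags=[i]) for i, s in enumerate(sentences)]

model = Doc2Vec(docs, vector_size=16, min_count=1, epochs=40)
sentence_vectors = [model.infer_vector(s) for s in sentences]
print(len(sentence_vectors), sentence_vectors[0].shape)  # 2 (16,)
```

These per-sentence vectors are what the abstract then feeds into the self-attention module to obtain the importance-weighted environment feature vectors.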

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing, and in particular relates to a deep neural network algorithm for automatically generating text summaries.

Background technique

[0002] With the rapid development and widespread adoption of the Internet in recent years, the volume of information has grown exponentially and the problem of information overload has become increasingly apparent. People face and must process massive amounts of text every day, so efficiently extracting the important, key content from large volumes of text has become an urgent need, and automatic text summary generation meets it. Text summarization already touches every aspect of daily life: distilling news keywords is summarization, and search engines such as Google and Baidu also make use of it. Automatic generation of text summaries is currently a relatively efficient method ...

Claims


Application Information

IPC(8): G06F16/34, G06F17/27, G06N3/04, G06N3/08
CPC: G06F16/345, G06N3/08, G06N3/045
Inventor: 王侃, 曹开臣, 刘万里, 徐畅, 潘袁湘
Owner 10TH RES INST OF CETC