Keyword extraction method and device based on graph and word and sentence collaboration

A keyword extraction technology based on graph and word-sentence collaboration, applied in the field of keyword extraction, which addresses the problem of inaccurately extracted keywords.

Status: Inactive | Publication Date: 2019-08-02
BEIJING UNIV OF POSTS & TELECOMM
3 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

In practical applications, the candidate word graph constructed by the TextRank algorithm has few edges and nodes, meaning that few candidate words are selected. The algorithm also tends to extract only candidate words with a high co-occurrence frequency (that is, high-frequency words) as keywords, which makes the final extracted keywords inaccurate.
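For context, a TextRank-style baseline can be sketched as below. This is a minimal illustration, not the implementation the patent refers to: the function name `textrank_keywords`, the sliding-window size, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for the example.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words with an unweighted PageRank over a co-occurrence graph.

    All parameter defaults are illustrative assumptions.
    """
    # Two distinct words are linked if they fall inside the same sliding
    # window of `window` consecutive tokens.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Repeatedly apply the PageRank update: each word receives an equal
    # share of the score of every neighbor.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

In this sketch a word that co-occurs with many other words collects score from all of them, which is consistent with the bias toward high-frequency words described above.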


Examples

Embodiment Construction

[0085] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0086] To solve the problem that the existing method of extracting keywords from text with the TextRank algorithm yields inaccurate keywords, an embodiment of the present invention provides a keyword extraction method and device based on graph and word-sentence collaboration.

[0087] The following first introduces the keyword extraction method based on graph and word-sentence collaboration provided by the embodiment of the present invention.

[...

Abstract

The embodiment of the invention provides a keyword extraction method and device based on graph and word-sentence collaboration. The method comprises the following steps: calculating the diffusivity corresponding to every two candidate words, based on the candidate words and sentences obtained from the text from which keywords are to be extracted; calculating, based on the diffusivity, a first weight of the edge between every two candidate words in an undirected weighted graph, and calculating a first index of each candidate word based on the first weight; calculating a second weight of the edge between every two sentences in a directed weighted graph, calculating a second index of each sentence based on the second weight, and obtaining a first index vector of the sentences based on the second indexes; constructing, based on the first index of each candidate word and the obtained first index vector, a second index vector of each candidate word containing a fifth index of the candidate word; and extracting keywords from the text based on the ordering of the fifth indexes in the second index vector by magnitude. The embodiment of the invention can improve the accuracy of keyword extraction from text.
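The overall shape of this two-graph scheme can be illustrated with a rough sketch. The abstract does not give the formulas for the diffusivity, the edge weights, or the fifth index, so the code below takes both sets of edge weights as inputs, scores each graph with a weighted PageRank, and combines the scores with a simple product rule. The names `weighted_rank` and `rank_keywords`, the PageRank scoring, and the combination rule are all hypothetical stand-ins, not the patented computation.

```python
def weighted_rank(nodes, edges, damping=0.85, iterations=50):
    """Weighted PageRank (an assumed scoring rule).

    `edges` maps (source, target) to a positive weight.
    """
    out_weight = {n: 0.0 for n in nodes}
    for (src, _), w in edges.items():
        out_weight[src] += w
    scores = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        for (src, dst), w in edges.items():
            # Each node passes on its score in proportion to edge weight.
            incoming[dst] += scores[src] * w / out_weight[src]
        scores = {n: (1 - damping) + damping * incoming[n] for n in nodes}
    return scores


def rank_keywords(word_edges, sentence_edges, sentences, top_k=2):
    """Combine word-graph and sentence-graph scores.

    `sentences` is a list of token lists; `sentence_edges` is keyed by
    sentence positions. The product rule below is an assumption.
    """
    words = sorted({w for s in sentences for w in s})
    # Undirected word graph: mirror every edge in both directions.
    symmetric = {}
    for (a, b), w in word_edges.items():
        symmetric[(a, b)] = w
        symmetric[(b, a)] = w
    word_score = weighted_rank(words, symmetric)
    # Directed sentence graph: edges are used as given.
    sent_score = weighted_rank(range(len(sentences)), sentence_edges)
    # Assumed combination: a word's own score times the total score of
    # the sentences that contain it.
    final = {
        w: word_score[w]
        * sum(sent_score[i] for i, s in enumerate(sentences) if w in s)
        for w in words
    }
    return sorted(final, key=lambda w: (-final[w], w))[:top_k]
```

Under these assumptions, a word rises in the ranking both when it is strongly connected in the word graph and when it appears in highly scored sentences, which is the collaboration between word-level and sentence-level evidence that the abstract describes.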

Description

Technical Field

[0001] The present invention relates to the technical field of keyword extraction, and in particular to a keyword extraction method and device based on graph and word-sentence collaboration.

Background

[0002] Keywords are representative words in a piece of text and give a brief summary of the topic of an article. Keywords reflect the subject content of a document or text and help people quickly locate its subject and main ideas. In addition, keywords have important application value in literature retrieval, text classification, recommendation systems and so on. Since manually labeling the keywords of documents or texts is time-consuming and difficult, automatic keyword extraction has become a hot research direction in the field of NLP (Natural Language Processing).

[0003] The existing method for extracting keywords from text uses the TextRank algorithm. The imp...

Application Information

IPC(8): G06F17/27
CPC: G06F40/216, G06F40/289
Inventors: 熊翱, 郭庆, 邱雪松, 孟洛明, 刘德荣
Owner: BEIJING UNIV OF POSTS & TELECOMM