A method and a device for extracting keywords from a document text

A keyword extraction method, applied in the field of document text keyword extraction

Pending Publication Date: 2019-01-11
BEIJING QIYI CENTURY SCI & TECH CO LTD
4 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

[0004] The prior-art keyword extraction method extracts keywords from the document body alone, so the results may include keywords that are unrelated to the subject of the document.



Examples


Embodiment Construction

[0074] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0075] To prevent keywords extracted from the document text from being irrelevant to the subject of the document, an embodiment of the present invention provides a method and a device for extracting keywords from document text, described in detail below.

[0076] The method for extracting keywords from document text provided by an embodiment of the present invention is introduced first.

[0077] See Figure 1. As shown, fig...



Abstract

The embodiment of the invention provides a method and a device for extracting keywords from document text. The method comprises the following steps: obtaining a word vector corresponding to the title of a target document; extracting keywords from the body text of the target document to obtain at least one candidate keyword; obtaining a word vector corresponding to each candidate keyword; for each candidate keyword, determining the similarity between the word vector of the candidate keyword and the word vector of the title; and determining the candidate keywords whose similarity satisfies a preset condition as the final keywords of the target document body. When the body keywords extracted by the embodiment are used to search for the target document, body keywords consistent with the subject of the target document can be obtained accurately.
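The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the abstract does not say how the title vector is built, which similarity measure is used, or what the preset condition is. The sketch assumes the title vector is the average of the title's word vectors, uses cosine similarity, and treats the preset condition as a fixed threshold. The names `cosine_similarity`, `average_vector`, `filter_keywords`, and the toy table `TOY_EMBEDDINGS` are hypothetical, and candidate keywords are assumed to come from an upstream extractor.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_vector(words: Sequence[str],
                   embeddings: Dict[str, List[float]]) -> List[float]:
    """Assumption: represent the title as the mean of its known word vectors."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def filter_keywords(title_words: Sequence[str],
                    candidates: Sequence[str],
                    embeddings: Dict[str, List[float]],
                    threshold: float = 0.5) -> List[str]:
    """Keep candidates whose similarity to the title vector meets the threshold.

    The threshold stands in for the abstract's unspecified "preset condition".
    """
    title_vector = average_vector(title_words, embeddings)
    if not title_vector:
        return []
    kept = []
    for keyword in candidates:
        vector = embeddings.get(keyword)
        if vector is None:
            continue  # no word vector available for this candidate
        if cosine_similarity(vector, title_vector) >= threshold:
            kept.append(keyword)
    return kept


# Hypothetical two-dimensional word vectors, for illustration only.
TOY_EMBEDDINGS = {
    "keyword": [1.0, 0.0],
    "extraction": [0.8, 0.6],
    "text": [0.9, 0.1],
    "weather": [0.0, 1.0],
}

final_keywords = filter_keywords(
    title_words=["keyword", "extraction"],
    candidates=["text", "weather", "keyword"],
    embeddings=TOY_EMBEDDINGS,
    threshold=0.5,
)
print(final_keywords)
```

With these toy vectors, "weather" points away from the title vector and is filtered out, while "text" and "keyword" are kept. In practice the word vectors would come from a trained embedding model, and the threshold would need tuning.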

Description

Technical field

[0001] The invention relates to the field of natural language processing, and in particular to a method and device for extracting keywords from document text.

Background

[0002] With the development of the information age and the explosive growth of text information on the Internet, how to effectively organize, classify and retrieve large amounts of information has become an issue of growing concern to most Internet users. Keywords concisely summarize the main content of a text, so keyword extraction is a core problem of information retrieval. Keyword extraction also plays a vital role in automatic summarization, information retrieval, text classification, text clustering, and similar tasks.

[0003] At present, the document text keyword extraction method mainly includes four steps: 1. Use a word segmentation tool to segment the document text to obtain the words in the d...


Application Information

IPC(8): G06F17/27
CPC: G06F40/258, G06F40/284, Y02D10/00
Inventor: 王亮
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD