Feature word extraction method, text similarity calculation method, device and equipment
A feature word extraction technique, applied in the computer field, that addresses the problems of feature words not being given directly and of low accuracy, and achieves the effect of expanding the screening scope and improving accuracy.
Examples
Embodiment 1
[0031] Embodiment 1 of the present invention provides a feature word extraction method, which extracts feature words of a target text by using an improved TF-IDF algorithm. Specifically, Figure 1 schematically shows a flow chart of the feature word extraction method according to Embodiment 1 of the present invention. As shown in Figure 1, the feature word extraction method may include steps S101 to S106, wherein:
[0032] Step S101, in response to a word segmentation instruction for the target text, perform word segmentation on the target text to obtain a word segmentation set.
[0033] Here, the target text can be any text, such as a paper, a patent or a technical article. A segmented word can be a single character or a word; for example, one segmented word is "most" and another is "similar".
[0034] In one approach, the word segmentation set includes all the segmented words that make up the target text.
[0035] For example, the target text is "Beijing welco...
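As a rough illustration of step S101 and of the TF-IDF scoring that Embodiment 1 builds on, the following sketch segments a text and ranks its words by textbook TF-IDF. It is not the improved algorithm of this embodiment, whose details are not given in the excerpt above. The function names `segment`, `tf_idf` and `top_feature_words`, the regex-based segmenter, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def segment(text):
    """Hypothetical stand-in for step S101: split text into segmented words.

    A real system for Chinese text would use a dedicated word segmenter;
    lowercase word matching is used here only to keep the sketch runnable.
    """
    return re.findall(r"\w+", text.lower())


def tf_idf(target_tokens, corpus_token_sets):
    """Textbook TF-IDF score for each distinct word in the target text."""
    counts = Counter(target_tokens)
    total = len(target_tokens)
    n_docs = len(corpus_token_sets)
    scores = {}
    for word, count in counts.items():
        tf = count / total
        # Number of corpus documents that contain the word.
        df = sum(1 for doc in corpus_token_sets if word in doc)
        # Add-one smoothing keeps the logarithm finite when df is 0.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[word] = tf * idf
    return scores


def top_feature_words(text, corpus, k=3):
    """Return the k highest-scoring words, ties broken alphabetically."""
    tokens = segment(text)
    corpus_sets = [set(segment(doc)) for doc in corpus]
    scores = tf_idf(tokens, corpus_sets)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

In this sketch a word that is frequent in the target text but rare in the reference corpus receives a high score, which is the usual reason for using TF-IDF to pick feature words.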
Embodiment 2
[0069] Embodiment 2 of the present invention provides a text similarity calculation method. Some steps of the text similarity calculation method correspond to the steps in Embodiment 1 above. These steps are not repeated in Embodiment 2; for details, reference may be made to Embodiment 1. Specifically, Figure 2 schematically shows a flow chart of the text similarity calculation method according to Embodiment 2 of the present invention. As shown in Figure 2, the text similarity calculation method may include steps S201 to S204, wherein:
[0070] Step S201, selecting the feature words of the target text, wherein the feature words of the target text are selected through the method described in Embodiment 1.
[0071] Step S202, input the feature words into the first text retrieval database to obtain several first texts.
[0072] In this embodiment, the first text retrieval database is composed of text ...
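Step S202 can be pictured with a small in-memory inverted index that returns the stored texts matching the feature words. This is only a sketch under stated assumptions: the class `TextRetrievalDatabase`, its methods `add` and `search`, whitespace tokenization, and ranking by the number of matched feature words are all hypothetical and are not taken from the embodiment.

```python
from collections import defaultdict


class TextRetrievalDatabase:
    """Hypothetical in-memory stand-in for the first text retrieval database."""

    def __init__(self):
        self._texts = {}
        self._index = defaultdict(set)  # word -> ids of texts containing it

    def add(self, text_id, text):
        """Store a text and index each of its whitespace-separated words."""
        self._texts[text_id] = text
        for word in text.lower().split():
            self._index[word].add(text_id)

    def search(self, feature_words):
        """Return texts containing at least one feature word, best match first."""
        hits = defaultdict(int)
        for word in feature_words:
            for text_id in self._index.get(word.lower(), ()):
                hits[text_id] += 1
        # More matched feature words first; ties broken by text id.
        ranked = sorted(hits, key=lambda tid: (-hits[tid], tid))
        return [self._texts[tid] for tid in ranked]


db = TextRetrievalDatabase()
db.add(1, "feature word extraction")
db.add(2, "text similarity calculation")
db.add(3, "feature word similarity")
first_texts = db.search(["feature", "similarity"])
```

Here `first_texts` plays the role of the "several first texts" obtained in step S202.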
Embodiment 3
[0078] Embodiment 3 of the present invention provides a text similarity calculation method. Some steps of the text similarity calculation method are the same as those in Embodiment 1 and Embodiment 2 above. These steps are not repeated in Embodiment 3; for details, reference may be made to Embodiment 1 and Embodiment 2. Specifically, Figure 3 schematically shows a flow chart of the text similarity calculation method according to Embodiment 3 of the present invention. As shown in Figure 3, the text similarity calculation method may include steps S301 to S307, wherein:
[0079] Step S301, selecting the feature words of the target text, wherein the feature words of the target text are selected through the method described in Embodiment 1.
[0080] Step S302, input the feature words into the first text retrieval database to obtain several first texts.
[0081] Step S303, expand the feature words to...
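The excerpt above does not state which similarity measure the method applies to the retrieved texts. Purely as a generic illustration of comparing a target text with candidate texts, the sketch below uses cosine similarity over word counts. The choice of cosine similarity, whitespace tokenization, and the names `cosine_similarity` and `rank_by_similarity` are assumptions and should not be read as the method of this embodiment.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text is treated as similar to nothing.
    return dot / (norm_a * norm_b)


def rank_by_similarity(target_text, candidate_texts):
    """Return (score, text) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(target_text, c), c) for c in candidate_texts]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

A call such as `rank_by_similarity(target, first_texts)` would order retrieved texts by their similarity to the target text under this assumed measure.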