Text abstraction method based on TF-IDF
A TF-IDF-based text abstraction (summarization) technique, applicable to unstructured text data retrieval, text database browsing/visualization, special data processing applications, and similar areas. It addresses problems such as the large computing resources and long training times required by RNNs.
Embodiment Construction
[0053] The embodiment of the invention is described below with reference to the accompanying drawings. Chinese text abstraction is divided into the following main steps:
[0054] S1: Chinese word segmentation
[0055] Chinese word segmentation refers to dividing a continuous sequence of Chinese characters and other regular characters into individual words according to how Chinese is understood. In the implementation, the jieba word segmentation tool can be used to segment the text. The segmented sentence is shown in Figure 2, where the sentence has been split into individual words.
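As an illustration of step S1, the following is a minimal sketch rather than the patent's own implementation. It assumes the jieba package is installed and uses its `jieba.lcut` function; the `segment` helper name, the sample sentence, and the character-level fallback used when jieba is unavailable are assumptions added for this example.

```python
# Minimal sketch of S1 (word segmentation). Assumption: jieba is installed.
# The fallback branch is NOT part of the described method; it only keeps
# the example runnable when jieba is missing.
try:
    import jieba
except ImportError:  # hypothetical fallback for environments without jieba
    jieba = None


def segment(text):
    """Split a Chinese sentence into a list of tokens."""
    if jieba is not None:
        # jieba.lcut returns the segmentation result as a list of strings
        return jieba.lcut(text)
    # Fallback: treat every non-whitespace character as its own token
    return [ch for ch in text if not ch.isspace()]


if __name__ == "__main__":
    sentence = "我爱自然语言处理"  # hypothetical sample sentence
    print(segment(sentence))
```

The exact token boundaries depend on the dictionary shipped with the installed jieba version, so no expected output is shown here.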
[0056] S2: Stop word removal
[0057] Ordinary Chinese text usually contains punctuation marks such as periods, commas, and semicolons. Once word segmentation is complete, these punctuation marks no longer need to be kept. In addition, the sentence contains some words that have little effect on its importance, such as 的, 了, not only, but als...
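Step S2 can be sketched as a simple filter over the token list produced by S1. This is an illustrative sketch, not the patent's implementation: the `STOP_WORDS` set is a tiny hypothetical list (only 的 and 了 come from the text above; 不仅 and 而且 are assumed renderings of the other examples), and the helper names and sample tokens are invented for the example. A real system would load a full stop-word list.

```python
import unicodedata

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"的", "了", "不仅", "而且"}


def is_punctuation(token):
    """True if every character is in a Unicode punctuation category (P*)."""
    return all(unicodedata.category(ch).startswith("P") for ch in token)


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop whitespace-only tokens, punctuation tokens, and stop words."""
    return [
        t for t in tokens
        if t.strip() and not is_punctuation(t) and t not in stop_words
    ]


if __name__ == "__main__":
    # Hypothetical output of the segmentation step
    tokens = ["今天", "的", "天气", "很", "好", "，", "我们", "去", "了", "公园", "。"]
    print(remove_stop_words(tokens))
```

Checking the Unicode category covers full-width marks such as ，。； as well as ASCII punctuation, so no separate punctuation list has to be maintained.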