Text classification method for different subject topics
A text classification technique for documents on different subject topics, applicable to text database clustering/classification, unstructured text data retrieval, special data processing applications, and similar tasks. Its intended effect is to improve classification accuracy, including average classification accuracy.
Embodiment Construction
[0020] To address the deficiencies of existing methods, this scheme designs a new secondary classification processing method which, building on selected feature words, determines an effective classification strategy for each stage. To make the feature words in the dictionary as representative as possible, the scheme selects words with the chi-square test, a hypothesis-testing method used in statistics for correlation analysis. Its model includes statistics on the frequency of related documents, which is more reliable than counting word frequency alone. The chi-square test also yields a separate set of feature words for each category, which is more targeted than the feature words obtained in aggregate using information gain.
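The per-category chi-square selection described in [0020] can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the standard 2x2 document-frequency form of the chi-square statistic, and the names `chi_square`, `select_features`, and `top_k` are hypothetical. Because the statistic is symmetric, the sketch also keeps only terms that are positively associated with a category; that filter is an assumption of this example and is not stated in the source.

```python
from collections import defaultdict


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/category contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, top_k):
    """Return up to top_k feature words per category (hypothetical helper).

    docs is a list of token lists, labels the matching category names.
    Counts are document frequencies, not raw word frequencies.
    """
    n_docs = len(docs)
    doc_sets = [set(tokens) for tokens in docs]

    cat_sizes = defaultdict(int)   # documents per category
    df = defaultdict(int)          # documents containing each term
    df_cat = defaultdict(int)      # documents per (term, category) pair
    for tokens, label in zip(doc_sets, labels):
        cat_sizes[label] += 1
        for term in tokens:
            df[term] += 1
            df_cat[(term, label)] += 1

    features = {}
    for cat, size in cat_sizes.items():
        scored = []
        for term, term_df in df.items():
            a = df_cat.get((term, cat), 0)
            b = term_df - a
            c = size - a
            d = n_docs - size - b
            # Assumption of this sketch: skip terms that signal the
            # category only by their absence (negative association).
            if a * d <= b * c:
                continue
            scored.append((chi_square(a, b, c, d), term))
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        features[cat] = [term for _, term in scored[:top_k]]
    return features
```

Each category gets its own ranked word list, in contrast to a single aggregate ranking such as one produced by information gain.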
[0021] Once the chi-square test has produced the feature words, a document can be represented as a vector composed of these feature words; the next question is how to classify it. Since the vocabular...