Two-stage combined text classification method based on probabilistic subject words
A text classification and subject-heading technology, applied in special-purpose data processing applications, instruments, and electrical digital data processing, that improves classification efficiency and achieves good classification results.
Embodiment Construction
[0015] The present invention is described in detail below with reference to the accompanying drawings. The described embodiments are intended only for illustration and do not limit the present invention.
[0016] The present invention proposes a two-level combined text classification method based on probabilistic subject words. When classifying manually, a person judging which category a text belongs to often only needs to observe a few key words in the text to reach the correct judgment. These key words are generally called subject words, and they are included in many classified dictionaries. However, it is impossible to give a strict formal definition of subject words. In a corpus-based learning method, a statistical counterpart can be defined instead, named the "probabilistic topic word" (PTW). These words are then extracted from the corpus by statistical means. Then use these "st...
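The excerpt does not state which statistic is used to extract probabilistic topic words, so the following is only an illustrative sketch under stated assumptions. It assumes a labeled corpus of tokenized documents and treats a word as a PTW when it appears in enough documents and its estimated P(category | word) is high for a single category. The function name `extract_ptw`, the parameters `min_df` and `min_prob`, and the toy corpus are all hypothetical and do not come from the patent text.

```python
from collections import Counter, defaultdict


def extract_ptw(docs, labels, min_df=2, min_prob=0.8):
    """Return {word: (category, P(category | word))}.

    Assumption (not from the source): a word counts as a probabilistic
    topic word when it occurs in at least `min_df` documents and at least
    a `min_prob` share of those documents belong to one category.
    """
    df = Counter()                    # number of documents containing each word
    df_by_cat = defaultdict(Counter)  # word -> category -> document count

    for tokens, cat in zip(docs, labels):
        for word in set(tokens):      # count each word once per document
            df[word] += 1
            df_by_cat[word][cat] += 1

    ptw = {}
    for word, n in df.items():
        if n < min_df:
            continue                  # too rare for a stable estimate
        cat, k = df_by_cat[word].most_common(1)[0]
        prob = k / n                  # estimate of P(cat | word)
        if prob >= min_prob:
            ptw[word] = (cat, prob)
    return ptw


# Toy labeled corpus (made up for illustration only).
docs = [
    ["match", "goal", "team"],
    ["team", "goal", "season"],
    ["stock", "market", "season"],
    ["market", "stock", "bank"],
]
labels = ["sports", "sports", "finance", "finance"]

ptw = extract_ptw(docs, labels)
for word in sorted(ptw):
    print(word, ptw[word])
```

In this toy corpus, "season" occurs equally often in both categories, so it is not selected, while words such as "goal" and "stock" that occur in only one category are. A real system would likely need smoothing and a larger corpus before such estimates become reliable.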