A Text Feature Selection Method Based on Imbalanced Dataset
A text feature selection method for imbalanced datasets, applicable to special data processing applications, unstructured text data retrieval, and text database clustering/classification. It addresses two problems: the important factors that affect feature selection are not fully considered, and information gain is expensive to compute.
Embodiment Construction
[0030] The strengths and weaknesses of the present invention can be checked and verified with the following evaluation indicators.
[0031] See Table 1. Recall and precision are commonly used in imbalanced data classification to measure the classification quality of a model. The F1 value gives a comprehensive view of classification performance by combining the two: it is the harmonic mean of recall and precision.
[0032] Table 1
[0033]
[0034] Here, TP (True Positive) is the number of positive samples that the classifier correctly classifies as positive; TN (True Negative) is the number of negative samples that the classifier correctly classifies as negative; FP (False Positive) is the number of negative samples that the classifier incorrectly classifies as positive; FN (False Negative) is the number of positive samples that the classifier incorrectly classifies as negative.
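The four counts can be tallied directly from true and predicted labels. The following is a minimal sketch, not part of the patent text: the function name `confusion_counts`, the `positive` parameter, and the sample labels are hypothetical, and it assumes a binary task with the standard confusion-matrix definitions.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    positive: int = 1,
) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for a binary classification task.

    Assumption: any label other than `positive` is treated as negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # positive sample predicted positive
            else:
                fp += 1  # negative sample predicted positive
        else:
            if truth == positive:
                fn += 1  # positive sample predicted negative
            else:
                tn += 1  # negative sample predicted negative
    return tp, tn, fp, fn


# Illustrative labels only (1 = positive/minority class, 0 = negative class).
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
```

On an imbalanced dataset the minority class would normally be passed as `positive`, so that FN counts the minority samples the classifier misses.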
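[0035] Recall: $\text{Recall} = \frac{TP}{TP + FN}$
[0036] Precision: $\text{Precision} = \frac{TP}{TP + FP}$
[0037] F1 value: $F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$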
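As a rough illustration, the three indicators can be computed from the counts defined in [0034]. This sketch assumes the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP), F1 as the harmonic mean of the two). The function names and the convention of returning 0.0 when a denominator is zero are assumptions made here, not taken from the patent.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    denom = tp + fn
    return tp / denom if denom else 0.0  # assumed convention for empty denominator


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    denom = tp + fp
    return tp / denom if denom else 0.0  # assumed convention for empty denominator


def f1_value(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts only: 8 true positives, 2 false positives, 8 false negatives.
r = recall(tp=8, fn=8)          # 8 / 16
p = precision(tp=8, fp=2)       # 8 / 10
f1 = f1_value(tp=8, fp=2, fn=8)
```

TN does not appear in any of the three formulas, which is why these indicators remain informative when the negative class heavily outnumbers the positive class.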
[0038] The data set in the...