
Classification prediction method and classification prediction device for non-balanced data sets

A technology for unbalanced data and classification prediction, applied in the fields of instruments, character and pattern recognition, computer parts, etc. It addresses the problems that resampled data can no longer truly reflect the distribution characteristics of the original data, and that the classification performance of the minority class is improved only while the prediction results of the majority class are reduced.

Status: Inactive
Publication Date: 2018-04-03
TAIYUAN UNIV OF TECH
0 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0006] However, the two existing approaches to classifying unbalanced data, the resampling method and the cost-sensitive method, have not yet achieved the desired effect.
Both improve the classification performance of the minority class to varying degrees, while the classification performance of the majority class decreases.
The resampling method has two defects. On the one hand, it changes the proportion of the two classes in the original data set, so that the resampled data can no longer truly reflect the distribution characteristics of the original data. On the other hand, the classifier is overtrained on the minority-class data, so that its training on the majority-class data is greatly weakened; minority-class prediction performance improves while majority-class prediction results degrade.
The cost-sensitive method likewise reduces the classification performance of the majority class when the cost-sensitivity factor of the minority class increases.
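The first defect can be made concrete with a small sketch. It assumes random oversampling, which is only one common resampling variant (the text does not say which variant it has in mind), and the helper names `random_oversample` and `minority_share` are hypothetical, introduced here for illustration.

```python
import random
from collections import Counter
from typing import List, Sequence


def random_oversample(labels: Sequence[int], minority: int, seed: int = 0) -> List[int]:
    """Return sample indices after duplicating minority samples at random
    until the minority class is as large as the largest class.
    Assumption: plain random oversampling, not the patent's own method."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    extra = [rng.choice(minority_idx) for _ in range(target - counts[minority])]
    return list(range(len(labels))) + extra


def minority_share(labels: Sequence[int], minority: int) -> float:
    """Fraction of samples that belong to the minority class."""
    return sum(1 for y in labels if y == minority) / len(labels)


if __name__ == "__main__":
    labels = [0] * 95 + [1] * 5          # 95 majority samples, 5 minority samples
    idx = random_oversample(labels, minority=1)
    resampled = [labels[i] for i in idx]
    print(minority_share(labels, 1), minority_share(resampled, 1))
```

With 95 majority and 5 minority labels, the minority share moves from 0.05 to 0.5, so a classifier trained on the resampled set sees class proportions that no longer match the original data.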


Examples


Embodiment Construction

[0053] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0054] The purpose of the present invention is to provide a classification prediction method for unbalanced data sets that evaluates the performance of the current classification result against a classifier evaluation index to determine the current reward and punishment function, and introduces the reward and punishment function to improve the classification performance of the minority class, while at the same time the classification performance of the majority class c...


Abstract

The invention discloses a classification prediction method and a classification prediction device for non-balanced data sets. The classification prediction method comprises the following steps: obtaining a training sample set from the non-balanced data set and the optimal classification result corresponding to that training sample set; classifying the training sample set based on a current objective function to obtain a current classification result; and judging whether the current classification result is consistent with the optimal classification result. If it is, the current objective function is taken as the optimal objective function. Otherwise, the performance of the current classification result is evaluated based on classifier evaluation indexes to determine a current reward and punishment function; the current objective function is corrected according to the current reward and punishment function to obtain a current corrected objective function; and the current corrected objective function is taken as the current objective function and classification is performed again. By introducing the reward and punishment function and continuously correcting the current objective function according to it, the optimal objective function is obtained, so that accurate classification prediction of non-balanced data sets is realized.
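The control flow of this loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the classifier is a hypothetical one-feature threshold rule, the objective function is assumed to be a class-weighted misclassification cost, the evaluation index is assumed to be per-class recall, and the reward and punishment step is assumed to raise the weight of whichever class is recalled poorly. All function names and the `step` and `max_rounds` parameters are invented for the example, and both classes are assumed to be present in the training labels.

```python
from typing import List, Sequence, Tuple


def predict(xs: Sequence[float], threshold: float) -> List[int]:
    """Label a sample 1 (minority) when its feature reaches the threshold, else 0."""
    return [1 if x >= threshold else 0 for x in xs]


def class_recalls(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (minority recall, majority recall): the assumed evaluation index."""
    hits, totals = {0: 0, 1: 0}, {0: 0, 1: 0}
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return hits[1] / totals[1], hits[0] / totals[0]


def fit(xs: Sequence[float], ys: Sequence[int], w_min: float, w_maj: float) -> float:
    """Minimise the current objective: class-weighted misclassification cost."""
    candidates = sorted(set(xs)) + [max(xs) + 1.0]

    def cost(threshold: float) -> float:
        return sum(
            w_min if y == 1 else w_maj
            for y, p in zip(ys, predict(xs, threshold))
            if y != p
        )

    return min(candidates, key=cost)


def train(xs: Sequence[float], ys: Sequence[int],
          max_rounds: int = 20, step: float = 1.0) -> Tuple[float, float, float]:
    """Classify, compare with the known labels, and correct the objective until
    the two agree or max_rounds is reached. Returns (threshold, w_min, w_maj)."""
    w_min, w_maj = 1.0, 1.0
    threshold = fit(xs, ys, w_min, w_maj)
    for _ in range(max_rounds):
        y_pred = predict(xs, threshold)
        if y_pred == list(ys):            # consistent with the optimal result: stop
            break
        rec_min, rec_maj = class_recalls(ys, y_pred)
        w_min += step * (1.0 - rec_min)   # assumed penalty for missed minority samples
        w_maj += step * (1.0 - rec_maj)   # assumed penalty for missed majority samples
        threshold = fit(xs, ys, w_min, w_maj)  # re-classify with the corrected objective
    return threshold, w_min, w_maj
```

On a training set that the threshold rule can separate, `train` stops in the first round with both weights unchanged. On a set it cannot separate, the weights keep growing until `max_rounds` is reached, which is why the sketch needs an iteration cap that the abstract does not mention.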

Description

Technical field

[0001] The invention relates to the technical field of unbalanced data set classification, in particular to a classification prediction method and a classification predictor for unbalanced data sets.

Background

[0002] A classification predictor is an important information processing technology used in many industries to predict the probability or possibility of the future development of an event. Classifying the data that represent an event with a classification predictor can be applied to the analysis of data in many industries to predict the possibility of the corresponding event.

[0003] However, many industry data are typical unbalanced data. Taking the binary classification problem as an example, the characteristic of unbalanced data is that the proportion of one class of data is much higher than that of the other class. Here, the data with a high proportion is called the majority class, and the data with a low proportion is called the minority class...


Application Information

IPC(8): G06K9/62
CPC: G06F18/214, G06F18/24323
Inventor: 李凤莲, 张雪英, 焦江丽, 王灿, 李坤奇, 黄丽霞, 孙颖, 陈桂军
Owner TAIYUAN UNIV OF TECH