
Classification method for unbalanced data based on lifting degree decision tree and improved SMOTE

A classification method using decision tree technology, applicable to fields such as instruments, character and pattern recognition, and computer components. It addresses inaccurate classification of test data and improves classification accuracy.

Pending Publication Date: 2020-12-11
XIAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a classification method for unbalanced data based on a lifting degree decision tree and improved SMOTE, which solves the prior-art problem that test data are classified inaccurately when the data contain a large amount of unbalanced class data.



Examples


Embodiment Construction

[0060] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0061] As shown in Figure 1, the classification method for unbalanced data based on the lifting degree decision tree and improved SMOTE is implemented according to the following steps:

[0062] Step 1: preprocess the data;

[0063] Step 2: process the preprocessed data set with the improved SMOTE algorithm to balance the unbalanced data set;

[0064] Step 3: divide the balanced data set into training data and test data using ten-fold cross-validation;

[0065] Step 4: train on the training data using the decision tree algorithm based on the lifting degree, and establish a decision tree model;

[0066] Step 5: test the test data with the established decision tree model and obtain the output result.
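
Steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the patented method: the improved SMOTE variant is not specified in this excerpt, so the sketch assumes classical SMOTE (interpolating between a minority sample and one of its nearest minority neighbours), followed by a plain ten-fold split. The function names `smote` and `ten_fold_indices`, the parameter `k`, and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Classical SMOTE (assumed here; the excerpt does not define the
    improved variant): build n_new synthetic points by interpolating
    between a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Hypothetical toy data: 40 majority samples, 8 minority samples.
majority = [[float(i), float(i % 7)] for i in range(40)]
minority = [[100.0 + i, 50.0 + i] for i in range(8)]

# Step 2: oversample the minority class until both classes have equal size.
balanced_minority = minority + smote(minority, len(majority) - len(minority))
X = majority + balanced_minority
y = [0] * len(majority) + [1] * len(balanced_minority)

# Step 3: ten-fold split of the balanced data set.
splits = list(ten_fold_indices(len(X)))
```

Each `(train, test)` pair in `splits` would then feed Steps 4 and 5, with the decision tree trained on the `train` indices and evaluated on the `test` indices.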

[0067] Step 1 is specifically:

[0068] The dataset is $\mathrm{Data\_set} = \{d_1, d_2, d_3, \ldots, d_p\}$, $o = 1, 2, 3, \ldots, p$ ...


Abstract

The invention discloses a classification method for unbalanced data based on a lifting degree decision tree and improved SMOTE. The method is implemented according to the following steps: preprocessing the data; processing the preprocessed data set with an improved SMOTE algorithm to balance the unbalanced data set; dividing the balanced data set into training data and test data using ten-fold cross-validation; training on the training data with a decision tree algorithm based on the lifting degree to establish a decision tree model; and testing the test data with the established decision tree model to obtain an output result. The method achieves high classification accuracy, the composition of the tree can be readily explained, construction is simple and fast, classification accuracy is not affected by redundant attributes, and the method is robust to noisy data.
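
The excerpt does not define the lifting degree. As a rough illustration only, the following sketch assumes it corresponds to the lift measure from association rule mining, lift(A = v → c) = P(c | A = v) / P(c), where a value above 1 means the attribute value makes class c more likely than its base rate. The function name `lift`, its parameters, and the toy rows are hypothetical; how the patent actually uses the measure to choose splits is not shown here.

```python
def lift(rows, labels, attr, value, target):
    """Assumed definition: lift(attr=value -> target)
    = P(target | attr=value) / P(target)."""
    n = len(rows)
    n_target = sum(1 for c in labels if c == target)
    matching = [c for r, c in zip(rows, labels) if r[attr] == value]
    if not matching or n_target == 0:
        return 0.0  # undefined cases are mapped to 0 in this sketch
    p_target_given_value = sum(1 for c in matching if c == target) / len(matching)
    p_target = n_target / n
    return p_target_given_value / p_target


# Hypothetical toy data: one categorical attribute and a binary class label.
rows = [
    {"outlook": "sunny"},
    {"outlook": "sunny"},
    {"outlook": "rain"},
    {"outlook": "rain"},
]
labels = ["yes", "yes", "yes", "no"]

sunny_yes = lift(rows, labels, "outlook", "sunny", "yes")
rain_no = lift(rows, labels, "outlook", "rain", "no")
```

Under this assumed definition, an attribute whose values have lift far from 1 for some class carries more information about that class than an attribute whose lift stays close to 1.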

Description

Technical field

[0001] The invention belongs to the technical field of data mining methods and relates to a method for classifying unbalanced data based on a lifting degree decision tree and improved SMOTE.

Background

[0002] With the rapid development of information technology and the spread of big data and 5G technology in recent years, more and more fields generate massive amounts of data. This data contains a great deal of irrelevant and redundant content. At the same time, the data in some fields contains a large amount of unbalanced class data, and using it for prediction or classification leads to inaccurate classification results on test data.

[0003] For the first problem above, that massive data contains a lot of irrelevant and redundant content, feature selection is used: irrelevant or redundant features and features that have little effect on classification are deleted, and retai...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/24323, G06F18/214
Inventor: 周红芳, 张家炜
Owner: XIAN UNIV OF TECH