Prediction method of unbalanced data set based on generative adversarial network

A prediction method for unbalanced data sets, applied in the fields of biological neural network models, prediction, and data processing applications. It addresses the difficulty, or even impossibility, of generating minority samples, and achieves stable prediction results and high prediction accuracy.

Pending Publication Date: 2021-08-24
XIAN UNIV OF TECH
Cites: 0; Cited by: 3

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a prediction method for unbalanced data sets based on generative adversarial networks, which solves the problem that existing methods find it very difficult, or even impossible, to generate minority samples when processing large data sets.



Examples


Embodiment

[0106] To test the effect of the method proposed in the present invention on unbalanced data sets, a bank telemarketing data set is used as the unbalanced test data.

[0107] The main test process of the proposed method is as follows: DCGAN is applied to the original (unbalanced) data set to obtain a balanced data set; the balanced data set is divided, and a CNN is trained on the training portion; finally, the trained CNN model is used to predict the effectiveness of bank telemarketing campaigns. In particular, the present invention compares the proposed method with SMOTEENN (SMOTE + ENN, a method often used to deal with imbalance) to illustrate the effectiveness and feasibility of the proposed method.
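For context on the SMOTE + ENN baseline, the SMOTE half creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal sketch of that interpolation step only, using the Python standard library. The function name `smote_like_oversample`, the parameter `k`, and the toy points are assumptions made for illustration; this is neither the patent's DCGAN-based method nor the implementation used in the comparison, and the ENN cleaning step is omitted.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Simplified illustration of the SMOTE idea; needs at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the base itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between base and target
        synthetic.append([b + lam * (t - b) for b, t in zip(base, target)])
    return synthetic


# Hypothetical toy minority samples (corners of the unit square)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point lies on a segment between two existing minority points, this kind of oversampling cannot leave the region spanned by the observed minority samples, which is one motivation for trying a learned generator instead.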

[0108] In traditional classification learning methods, classification accuracy (the ratio of the number of correctly classified samples to the total number of samples) is ...
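The definition of classification accuracy given above can be sketched in a few lines of Python. The example also computes recall on the minority class, which is an illustrative addition and not necessarily a metric used in the patent; the 90/10 class split and the always-majority predictor are hypothetical toy values chosen to show how accuracy behaves on unbalanced data.

```python
def accuracy(y_true, y_pred):
    """Ratio of correctly classified samples to the total number of samples."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def minority_recall(y_true, y_pred, minority_label=1):
    """Fraction of true minority samples that were predicted as minority."""
    hits = sum(
        1 for t, p in zip(y_true, y_pred)
        if t == minority_label and p == minority_label
    )
    total = sum(1 for t in y_true if t == minority_label)
    return hits / total


# Hypothetical unbalanced labels: 90 majority (0) and 10 minority (1) samples
y_true = [0] * 90 + [1] * 10
# A trivial predictor that always outputs the majority class
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = minority_recall(y_true, y_pred)
print(acc, rec)
```

In this toy case the predictor never identifies a minority sample, yet its accuracy is still high, so accuracy alone says little about minority-class performance.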



Abstract

The invention discloses an unbalanced data set prediction method based on a generative adversarial network. The method comprises the following steps: receiving a prediction request; collecting data to form a data set, and determining the features and labels in the data set and the numbers of minority class samples and majority class samples; converting the non-numerical feature columns and the label column in the data set into categorical numerical values; standardizing the processed data set, and separating the majority class samples and minority class samples in the standardized data set; synthesizing minority class samples using a deep convolutional generative adversarial network to form a balanced data set; dividing the balanced data set into a training set and a test set; constructing a convolutional neural network, and training it on the training set to obtain a trained convolutional neural network; finally, inputting the test set into the trained convolutional neural network to obtain a prediction result. The prediction method solves the problem that, in existing methods, minority class samples are very difficult or even impossible to generate when processing big data.
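The data preparation steps listed in the abstract (encoding, standardizing, separating the classes, balancing, splitting) can be sketched with the Python standard library as follows. Everything concrete here is an assumption for illustration: the toy rows and column names, standardizing features but not labels, the 80/20 split, and above all the `synthetic` list, which simply repeats minority rows as a placeholder where the output of a trained DCGAN generator would go. The CNN training and prediction steps are not shown.

```python
import random
import statistics


def encode_column(values):
    """Map each distinct non-numeric value to an integer code."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def standardize_column(values):
    """Scale a numeric column to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against constant columns
    return [(v - mean) / std for v in values]


# Hypothetical toy rows: (age, job, label)
rows = [(30, "admin", "no"), (45, "technician", "no"), (52, "admin", "no"),
        (38, "services", "no"), (27, "technician", "no"), (61, "admin", "no"),
        (41, "services", "yes"), (35, "admin", "yes")]

ages = standardize_column([r[0] for r in rows])
jobs = standardize_column(encode_column([r[1] for r in rows]))
labels = encode_column([r[2] for r in rows])  # "no" -> 0, "yes" -> 1

# Separate majority and minority class samples
features = list(zip(ages, jobs))
majority = [x for x, y in zip(features, labels) if y == 0]
minority = [x for x, y in zip(features, labels) if y == 1]
n_to_synthesize = len(majority) - len(minority)

# PLACEHOLDER: a trained DCGAN generator would supply these samples;
# here minority rows are merely repeated so the sketch runs end to end.
synthetic = [minority[i % len(minority)] for i in range(n_to_synthesize)]

# Form the balanced data set and divide it (assumed 80/20 split)
balanced = [(x, 0) for x in majority] + [(x, 1) for x in minority + synthetic]
random.Random(0).shuffle(balanced)
split = int(0.8 * len(balanced))
train, test = balanced[:split], balanced[split:]
print(n_to_synthesize, len(train), len(test))
```

The quantity `n_to_synthesize` is the number of minority samples the generator would need to produce for the two classes to have equal size.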

Description

Technical field

[0001] The invention belongs to the technical field of prediction methods for unbalanced data sets, and relates to a prediction method for unbalanced data sets based on generative adversarial networks.

Background

[0002] With the rapid development of information technology, data in various fields are being generated at an unprecedented speed and are widely collected and stored. How to process data intelligently and use the valuable information it contains has become a research hotspot in both theory and application. Machine learning is a mainstream intelligent data processing technology, and classification is one of the important research topics in the field. Some existing classification methods are relatively mature and generally achieve good results when classifying balanced data. However, real-world data are often unbalanced, that is, the number of samples of ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08, G06Q10/04
CPC: G06N3/084, G06Q10/04, G06N3/048, G06N3/045
Inventors: 王竹荣, 牛亚邦, 黑新宏
Owner: XIAN UNIV OF TECH