Method for testing categorical data set

A technique for testing classification data sets, applied in the field of multi-label classification, that addresses the problem of low classification accuracy.

Inactive Publication Date: 2015-11-25
CHINA UNIV OF GEOSCIENCES (WUHAN)
AI Technical Summary

Problems solved by technology

[0004] The invention provides a method for testing a classification data set, addressing the technical problem of low classification accuracy in the prior art.

Embodiment Construction

[0053] The core idea of the present invention is as follows. When classifying data, the naive Bayes multi-label classification algorithm ignores the fact that different attributes have different importance for class label selection. To address this, a double-weighted naive Bayes multi-label classification method is proposed to classify a classification data set. According to how important each attribute feature is to the decision on each class label in the class label set, the edge between each attribute and each class label is weighted; that is, the weighting is applied doubly, over both attribute features and class labels.
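The idea of weighting each attribute-to-label edge can be sketched as follows. This is a minimal illustration, not the patent's formula, which is not given in this excerpt. It assumes categorical attributes, one present/absent decision per label, Laplace smoothing, and a weight `weights[j][k]` applied as an exponent on the conditional likelihood of attribute `j` for label `k` (a common form of attribute-weighted naive Bayes). The function names `train_counts` and `predict` are hypothetical.

```python
import math
from collections import defaultdict


def train_counts(X, Y, n_labels):
    """Collect the counts a naive Bayes multi-label classifier needs.

    X: list of tuples of categorical attribute values.
    Y: list of sets of label indices (one set per instance).
    """
    n = len(X)
    n_attrs = len(X[0])
    # pos[k] = number of training instances that carry label k
    pos = [sum(1 for y in Y if k in y) for k in range(n_labels)]
    # counts[(k, b, j, v)] = instances where (label k present == b) and attribute j == v
    counts = defaultdict(int)
    values = [set() for _ in range(n_attrs)]
    for x, y in zip(X, Y):
        for j, v in enumerate(x):
            values[j].add(v)
            for k in range(n_labels):
                counts[(k, k in y, j, v)] += 1
    return {"n": n, "pos": pos, "counts": counts, "values": values}


def predict(model, x, weights):
    """Return the labels whose weighted log-posterior favours 'present'.

    weights[j][k] is the assumed weight on the edge between attribute j and label k.
    """
    n, pos = model["n"], model["pos"]
    counts, values = model["counts"], model["values"]
    predicted = set()
    for k in range(len(pos)):
        score = {}
        for present in (True, False):
            n_b = pos[k] if present else n - pos[k]
            # Laplace-smoothed log prior for "label k present / absent"
            log_p = math.log((n_b + 1) / (n + 2))
            for j, v in enumerate(x):
                # Laplace-smoothed conditional likelihood of attribute value v
                lik = (counts[(k, present, j, v)] + 1) / (n_b + len(values[j]))
                # The edge weight scales this attribute's contribution to label k
                log_p += weights[j][k] * math.log(lik)
            score[present] = log_p
        if score[True] > score[False]:
            predicted.add(k)
    return predicted
```

With all weights set to 1 this reduces to ordinary naive Bayes applied per label; setting `weights[j][k]` to 0 removes attribute `j` from the decision on label `k`.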

[0054] Specifically, the present invention uses the niche culture algorithm to learn and optimize the double weights in the double-weighted naive Bayes multi-label classifier, and the resulting optimal weight combination is substituted into the current double weighted na...

Abstract

The invention discloses a method for testing a categorical data set. After the categorical data set is obtained, it is standardized using the absolute standard deviation method if standardization is required. The data set is then divided into a training set and a test set. A niche cultural algorithm is used to learn the dual weights of a dual-weighted naive Bayes multi-label classifier, and training on the training set yields the optimized weights. The optimized weights are then applied to the test set for prediction. The method thus adds a data training process on top of the traditional naive Bayes multi-label algorithm before predicting on the categorical data set. As traditional data classification is improved through a particle swarm optimization algorithm, the improved algorithm can improve the classification accuracy.
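The first two steps of the abstract (standardize, then split) might look like the sketch below. It reads "absolute standard deviation" as the mean absolute deviation of each column, which is an assumption since the excerpt does not define the term. The function names, the 30% test fraction, and the fixed seed are likewise illustrative choices, not values from the patent. The weight-learning step is omitted because its details are not available here.

```python
import random


def standardize_mad(rows):
    """Scale each column to zero mean and unit mean absolute deviation.

    Assumption: 'absolute standard deviation' means mean absolute deviation.
    """
    n = len(rows)
    out_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        mad = sum(abs(v - mean) for v in col) / n
        # A constant column has zero deviation; map it to zeros to avoid dividing by zero.
        out_cols.append([(v - mean) / mad if mad else 0.0 for v in col])
    return [list(r) for r in zip(*out_cols)]


def split_train_test(rows, labels, test_fraction=0.3, seed=0):
    """Shuffle instance indices and split them into a training set and a test set."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(rows) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test
```

The training portion would then be passed to whatever procedure learns the dual weights, and the held-out portion used only for prediction.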

Description

Technical field

[0001] The present application relates to the technical field of multi-label classification, and in particular to a method for testing a classification data set.

Background art

[0002] Multi-label learning originates from text classification problems, in which each document may belong to several predefined topics, such as health and government. Problems of this type now also arise widely in real-life applications: in the field of video search, each audio clip can be assigned different emotional labels, such as "cheerful" and "joyful"; in gene function analysis, a gene may correspond to multiple functional labels, such as "tall" and "fair skin"; in the field of image attribution, an image may belong to several scene labels at the same time, such as "big tree" and "tall building". In short, multi-label classification is used in more and more practical applications, and studying it in greater depth will bring greater benefits to daily life. C...

Application Information

IPC(8): G06F17/30
CPC: G06F16/353
Inventor 颜雪松
Owner CHINA UNIV OF GEOSCIENCES (WUHAN)