Method and apparatus for nondestructive grouping of unordered categorical variable information

A technology for the nondestructive grouping of unordered categorical variables, applied in informatics, special data processing applications, instruments, etc. It addresses problems such as low efficiency, no guarantee of effectiveness, and poor results in downstream use, and achieves fast computation.

Status: Inactive
Publication Date: 2016-11-09
深圳前海信息技术有限公司

AI Technical Summary

Problems solved by technology

For the lossless grouping of unordered categorical variables, two approaches are generally adopted. One is to group by experience, which is extremely inefficient and cannot be guaranteed to be effective. The other is to use the variable directly without grouping; in that case, when the unordered categorical variable takes a very wide range of values, the results in subsequent modeling and other applications are often very poor.

Examples

Detailed Description of the Embodiments

[0034] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0035] Refer to Figure 1, which is a schematic diagram of the steps of a method for losslessly grouping unordered categorical variable information in an embodiment of the present invention.

[0036] In one embodiment of the present invention, a method for nondestructive grouping of unordered categorical variable information is proposed, including:

[0037] Step S1, under the supervision of the binary target variable, calculate a weight-of-evidence value for each category of the unordered categorical variable;

[0038] Step S2, perform equal-depth grouping on the above weight-of-evidence values to divide them into M intervals, and use these M intervals as the groups of the unordered categorical variable.
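Steps S1 and S2 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation. The excerpt does not give the weight-of-evidence formula, so the commonly used definition WoE = ln((events_c / total_events) / (non_events_c / total_non_events)) is assumed, with a small additive smoothing term to avoid taking the logarithm of zero. "Equal-depth" is read here as "each group holds roughly the same number of categories"; a record-weighted split is another plausible reading. The function names `weight_of_evidence` and `equal_depth_groups` and the `smoothing` parameter are hypothetical.

```python
import math
from collections import Counter


def weight_of_evidence(categories, targets, smoothing=0.5):
    """Step S1 (sketch): map each category to a weight-of-evidence value.

    Assumes targets are binary (1 = event, 0 = non-event) and uses the
    common WoE definition with additive smoothing; both are assumptions,
    since the source excerpt does not state the formula.
    """
    events = Counter(c for c, y in zip(categories, targets) if y == 1)
    non_events = Counter(c for c, y in zip(categories, targets) if y != 1)
    cats = set(events) | set(non_events)
    total_e = sum(events.values()) + smoothing * len(cats)
    total_n = sum(non_events.values()) + smoothing * len(cats)
    return {
        c: math.log(
            ((events[c] + smoothing) / total_e)
            / ((non_events[c] + smoothing) / total_n)
        )
        for c in cats
    }


def equal_depth_groups(woe, m):
    """Step S2 (sketch): sort categories by WoE and cut them into m groups
    that each contain about the same number of categories."""
    ordered = sorted(woe, key=lambda c: (woe[c], str(c)))
    n = len(ordered)
    return {c: i * m // n for i, c in enumerate(ordered)}


# Toy data: four categories with decreasing event rates.
counts = {"A": (8, 2), "B": (6, 4), "C": (3, 7), "D": (1, 9)}  # (events, non-events)
categories, targets = [], []
for cat, (n_event, n_non_event) in counts.items():
    categories += [cat] * (n_event + n_non_event)
    targets += [1] * n_event + [0] * n_non_event

woe = weight_of_evidence(categories, targets)
groups = equal_depth_groups(woe, m=2)
print(groups)  # D and C fall into group 0, B and A into group 1
```

Because the groups are formed on the sorted weight-of-evidence values, categories with similar relationships to the target end up in the same group, which is how the grouping is meant to keep the variable's ability to distinguish the target.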

[0039] At present, for the grouping of unordered categorical variables to achieve effective in...

Abstract

The invention discloses a method and an apparatus for nondestructive grouping of unordered categorical variable information. The method comprises the steps of calculating a weight-of-evidence value for each category of an unordered categorical variable under the supervision of a binary target variable; and performing equal-depth grouping on the weight-of-evidence values, dividing them into M intervals, and taking the M intervals as the groups of the unordered categorical variable. With the disclosed method and apparatus, the grouping process is simple and easy to understand, the calculation is fast, and the ability of the unordered categorical variable to distinguish the target variable is well preserved.

Description

Technical field

[0001] The invention relates to the field of grouping unordered categorical variables, in particular to a method and device for lossless grouping of unordered categorical variable information.

Background

[0002] With the development of technologies such as the Internet, cloud computing, and the Internet of Things, the amount of data in various industries has grown explosively. Unordered categorical variables account for a large part of this data, and a fast and effective preprocessing method is needed for them so that the value in the data can be discovered quickly.

[0003] At present, for the problem of variable grouping in data preprocessing, most work studies the grouping or binning of continuous variables. As for how to achieve effective, information-lossless grouping of unordered categorical variables, two processing methods are basically adopted: one is to group by experience...

Application Information

IPC(8): G06F19/00
CPC: G16Z99/00
Inventors: 梁猛, 王界兵, 张伟, 李杰, 韦辉华, 郭宇翔
Owner: 深圳前海信息技术有限公司