Method, device, electronic device and storage medium for dealing with unbalanced data categories

A technology for processing unbalanced data categories, applied in kernel methods, character and pattern recognition, instruments, etc.; it addresses the problem of unbalanced data categories.

Active Publication Date: 2021-06-15
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0006] This application provides a method, device, electronic device and storage medium for dealing with unbalanced data categories. It improves on the use of SMOTE to address the data imbalance problem, which can improve the classification performance of SMOTE.

Embodiment Construction

[0034] The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the protection scope of this application.

[0035] The solution provided by this application may involve cloud technology.

[0036] Cloud computing refers to the delivery and use mode of IT infrastructure, that is, obtaining the required resources through the network in an on-demand and easily scalable manner; cloud computing in a broad sense refers to the delivery and use mode of services, that is, obtaining the required services in an on-demand and easily scalable manner. Such services can be IT and software, ...

Abstract

The present application provides a method, device, electronic device and storage medium for dealing with unbalanced data categories, and relates to the field of big data processing in cloud technology. For a minority class sample X_i, the method determines the mutual information between X_i and each neighbor sample X_ij(near) among the M nearest neighbor samples of X_i; determines the mutual information weight of X_ij(near) based on the mutual information between X_i and X_ij(near); determines the weight W_ij(near) of X_ij(near) based on the type of X_ij(near) and its mutual information weight; determines, based on W_ij(near) and the class imbalance multiplier N, the number N_j of minority class samples to be inserted between X_i and X_ij(near); and inserts N_j new samples between X_i and X_ij(near). By combining mutual information with SMOTE to deal with data category imbalance, the classification performance of SMOTE can be improved.
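
The allocation steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented procedure: the abstract does not say how mutual information between samples is estimated, how the weights are normalized, or how the neighbor type enters the weight. Here the mutual information values are taken as given inputs, weights are normalized to sum to 1, a majority-class neighbor is down-weighted by a hypothetical `majority_factor`, and N is treated as the integer number of new samples to generate for X_i. The function name and all parameter names are hypothetical.

```python
import numpy as np


def allocate_and_synthesize(x_i, neighbors, mi, is_minority, N,
                            majority_factor=0.5, rng=None):
    """Split N new samples across the M neighbors of x_i and interpolate.

    Assumptions (not from the source): `mi` holds precomputed, positive
    mutual information values; weights are normalized to sum to 1;
    majority-class neighbors are scaled by `majority_factor`.
    """
    rng = np.random.default_rng() if rng is None else rng
    x_i = np.asarray(x_i, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    mi = np.asarray(mi, dtype=float)

    # Mutual information weight of each neighbor (assumed: simple normalization).
    mi_weight = mi / mi.sum()

    # Combine with the neighbor type to get W_ij(near) (assumed: scaling factor).
    type_factor = np.where(np.asarray(is_minority), 1.0, majority_factor)
    w = mi_weight * type_factor
    w = w / w.sum()

    # N_j: integer share of N per neighbor, using largest-remainder rounding
    # so that the counts add up to exactly N.
    n_j = np.floor(w * N).astype(int)
    remainder = int(N - n_j.sum())
    order = np.argsort(-(w * N - n_j))
    n_j[order[:remainder]] += 1

    # Insert N_j samples on the segment between x_i and each neighbor.
    parts = []
    for x_near, count in zip(neighbors, n_j):
        gaps = rng.random((count, 1))  # interpolation factors in [0, 1)
        parts.append(x_i + gaps * (x_near - x_i))
    return n_j, np.vstack(parts)


if __name__ == "__main__":
    counts, new_samples = allocate_and_synthesize(
        x_i=[0.0, 0.0],
        neighbors=[[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]],
        mi=[0.5, 0.3, 0.2],
        is_minority=[True, True, False],
        N=5,
        rng=np.random.default_rng(0),
    )
    print(counts, new_samples.shape)
```

Under these assumptions, neighbors that share more information with X_i receive more of the N new samples, and a majority-class neighbor receives fewer than a minority-class neighbor with the same mutual information.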

Description

Technical field

[0001] The embodiments of the present application relate to the field of cloud technology, in particular to big data processing in cloud technology, and more specifically to a method, device, electronic device and storage medium for dealing with unbalanced data categories.

Background

[0002] Data category imbalance is a common problem that affects the performance of classification models.

[0003] At present, the widely used method for addressing data imbalance is the Synthetic Minority Oversampling Technique (SMOTE). Unlike general oversampling techniques, SMOTE does not obtain its new minority class samples by repeated sampling. Instead, it synthesizes a new sample by interpolating between two minority class samples, that is, it generates new samples within the distribution boundary of the minority class and adds them to the minority class, so as to achieve class balance. The samples gen...
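
The interpolation idea behind standard SMOTE, as described in the background, can be sketched in a few lines of NumPy. This is a minimal illustration of the general technique, not the method of this application; the function name `smote_basic` and the parameters `n_new_per_sample` and `k` are hypothetical, and Euclidean distance is assumed for the neighbor search.

```python
import numpy as np


def smote_basic(minority, n_new_per_sample, k=5, rng=None):
    """Generate synthetic minority samples by linear interpolation.

    For each minority sample, pick one of its k nearest minority
    neighbors at random and place a new point on the segment between them.
    """
    rng = np.random.default_rng() if rng is None else rng
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)

    # Pairwise Euclidean distances; a sample is never its own neighbor.
    dist = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]

    synthetic = []
    for i in range(n):
        for _ in range(n_new_per_sample):
            j = rng.choice(nearest[i])   # random neighbor among the k nearest
            gap = rng.random()           # interpolation factor in [0, 1)
            synthetic.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(synthetic)


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    new_points = smote_basic(points, n_new_per_sample=2, k=2,
                             rng=np.random.default_rng(0))
    print(new_points.shape)
```

Because every new point lies on a segment between two existing minority samples, the synthetic samples stay inside the region already spanned by the minority class.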

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06N20/10
CPC: G06N20/10, G06F18/24147, G06F18/24155, G06F18/2411, G06F18/214
Inventor: 刘志煌
Owner: TENCENT TECH (SHENZHEN) CO LTD