
ML-kNN (multi-label k-nearest neighbor) improvement method and ML-kNN improvement system applicable to multi-label classification

A multi-label classification technology, applied in the field of ML-kNN improvement methods and systems, that addresses the problems of ignoring label correlation, reduced label discrimination, and increased classification error.

Inactive Publication Date: 2017-09-05
INST OF COMPUTING TECH CHINESE ACAD OF SCI
2 Cites, 15 Cited by

AI Technical Summary

Problems solved by technology

However, ML-kNN still has some deficiencies. First, despite the multi-label nature of the samples, it does not distinguish the feature vectors corresponding to different labels of the same sample. That is, if a sample has several different labels, the ML-kNN method treats all of these labels as having the same feature vector, which reduces the degree of discrimination between labels and increases the classification error. Second, when calculating the distance between samples, ML-kNN uses the classic cosine similarity as the sample distance, a calculation that does not take the correlation between labels into account. For example, in a medical diagnosis data set, the two disease labels "bronchopneumonia" and "bronchitis" are strongly correlated, and this correlation has some impact on the distance calculation, which the ML-kNN method does not consider.
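The second deficiency can be illustrated with a small sketch. The toy samples, the label sets, and the helper `cooccurrence_rate` below are hypothetical and are not taken from the patent; they only show that cosine similarity looks at feature vectors alone, while a simple co-occurrence statistic reveals how strongly two labels are related.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def cooccurrence_rate(label_sets, a, b):
    """Fraction of samples carrying label `a` that also carry label `b`.

    Hypothetical helper, used here only to make label correlation visible.
    """
    with_a = [labels for labels in label_sets if a in labels]
    if not with_a:
        return 0.0
    return sum(1 for labels in with_a if b in labels) / len(with_a)

# Two toy samples with identical features but unrelated label sets.
features_1, labels_1 = [1.0, 1.0, 0.0], {"bronchopneumonia", "bronchitis"}
features_2, labels_2 = [1.0, 1.0, 0.0], {"fracture"}

# The similarity depends on the features only; the labels never enter.
similarity = cosine_similarity(features_1, features_2)

# Label correlation measured over a toy collection of label sets.
toy_label_sets = [
    {"bronchopneumonia", "bronchitis"},
    {"bronchopneumonia", "bronchitis"},
    {"bronchitis"},
    {"fracture"},
]
rate = cooccurrence_rate(toy_label_sets, "bronchopneumonia", "bronchitis")
```

In this toy data every sample labeled "bronchopneumonia" is also labeled "bronchitis", yet that relationship has no influence on `similarity`.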


Embodiment Construction

[0036] The invention provides an improved ML-kNN method suitable for multi-label classification, with the goal of improving the performance of ML-kNN in multi-label classification.

[0037] To achieve the above object, the technical scheme adopted in the present invention is as follows:

[0038] Step 1: Obtain the original data set, which contains multiple samples and involves a total of C label categories, where each sample has multiple labels and multiple features, and each feature corresponds to one feature value. Count the total number of samples carrying each class of label in the original data set as the label sample number; within the samples of each class of label, count the total number of samples having each class of feature as the feature sample number; and calculate the feature label weight from the label sample number and the feature sample number. The specific implementation method for calculating the feature label weight includes the follo...
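The counting part of Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions: each sample is represented as a dictionary of feature values plus a set of labels, and a feature is treated as present in a sample when it appears as a key. The function name and the toy data are hypothetical. The weight formula itself is cut off in the text above, so the sketch stops at the two counts and does not attempt to compute the weight.

```python
from collections import Counter, defaultdict

def count_label_and_feature_samples(dataset):
    """Return (label sample numbers, feature sample numbers).

    dataset: list of (features, labels) pairs, where `features` maps a
    feature name to its value and `labels` is a set of label names.
    Assumption: a feature counts as present when it is a key of `features`.
    """
    label_counts = Counter()               # label -> number of samples
    feature_counts = defaultdict(Counter)  # label -> feature -> number of samples
    for features, labels in dataset:
        for label in labels:
            label_counts[label] += 1
            for feature in features:
                feature_counts[label][feature] += 1
    return label_counts, feature_counts

# Hypothetical toy data set with two labels and two features.
toy_dataset = [
    ({"cough": 1.0, "fever": 1.0}, {"bronchitis", "pneumonia"}),
    ({"cough": 1.0}, {"bronchitis"}),
    ({"fever": 1.0}, {"pneumonia"}),
]
label_counts, feature_counts = count_label_and_feature_samples(toy_dataset)
```

Both counts are kept per label so that any weighting rule based on the label sample number and the feature sample number can be applied afterwards.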


Abstract

The invention relates to an ML-kNN (multi-label k-nearest neighbor) improvement method and an ML-kNN improvement system applicable to multi-label classification. The method includes: counting the total number of samples of each class of label in the original data set as the label sample number, counting the total number of samples of each class of feature within the samples of each class of label as the feature sample number, and computing feature label weights from the label sample numbers and the feature sample numbers, where each feature corresponds to one feature value; splitting each sample in the original data set into a plurality of original single-label samples, each with a single label, and updating the feature values of each original single-label sample according to the feature label weights to generate a first data set; and acquiring a to-be-measured sample to be predicted, splitting it into to-be-measured single-label samples, sequentially predicting the labels of the to-be-measured single-label samples according to the first data set, and determining the label set of the to-be-measured sample. The method and system have the advantage of producing more accurate predictions in multi-label classification.
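The splitting and updating step described in the abstract can be sketched as below. The abstract does not say how a feature value is updated from its feature label weight, so this sketch assumes a simple multiplication and takes the weights as a ready-made input; that update rule, the function names, and the toy values are all hypothetical.

```python
def split_into_single_label(features, labels):
    """Split one multi-label sample into one copy per label."""
    return [(dict(features), label) for label in sorted(labels)]

def reweight(features, label, weights):
    """Scale each feature value by its weight for the given label.

    Assumption: the update is a multiplication, and a missing
    (feature, label) weight defaults to 1.0. The source does not
    specify the actual update rule.
    """
    return {
        name: value * weights.get((name, label), 1.0)
        for name, value in features.items()
    }

def build_first_dataset(dataset, weights):
    """Build the single-label, reweighted data set from the original one."""
    first = []
    for features, labels in dataset:
        for copy, label in split_into_single_label(features, labels):
            first.append((reweight(copy, label, weights), label))
    return first

# Hypothetical toy input: one sample with two labels.
toy_dataset = [({"cough": 1.0, "fever": 2.0}, {"bronchitis", "pneumonia"})]
toy_weights = {("cough", "bronchitis"): 0.5, ("fever", "pneumonia"): 2.0}
first_dataset = build_first_dataset(toy_dataset, toy_weights)
```

After the split, each label of the original sample owns its own feature vector, which is the property the method relies on to separate labels that would otherwise share identical features.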

Description

Technical field

[0001] The invention relates to the field of machine learning, in particular to an improved ML-kNN method and system suitable for multi-label classification.

Background

[0002] In traditional single-label classification, a classifier is learned from a set of samples that each have only one label l, where l comes from the label set L, |L|>1. If |L|=2, the learning problem is called a binary classification problem; if |L|>2, it is a multi-class classification problem. In multi-label classification, however, a sample often has several labels Y, where Y ⊆ L. In practice there are many multi-label classification problems. In text classification, for example, a text may concern both sports and politics; in medical disease diagnosis, a patient often has multiple complications, such as respiratory tract infection, bronchitis, and pneumonia at the same time. The paper (Tsoumakas G, Katakis I. Multi-Label Classification: An Overview[J]. Interna...
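The distinction between binary, multi-class, and multi-label problems described in the background can be made concrete with a short sketch. The function names and the toy label set are hypothetical; the indicator encoding shown is one common way to represent multi-label data, not something the patent prescribes.

```python
def single_label_problem_type(label_set):
    """Name the single-label problem implied by the size of the label set L."""
    size = len(label_set)
    if size < 2:
        raise ValueError("the label set must contain more than one label")
    return "binary" if size == 2 else "multi-class"

def to_indicator_rows(sample_label_sets, label_set):
    """Encode each sample's label subset Y as a 0/1 row over the sorted label set L."""
    order = sorted(label_set)
    return [
        [1 if label in labels else 0 for label in order]
        for labels in sample_label_sets
    ]

# Hypothetical label set and samples; the first text is about two topics.
all_labels = {"politics", "sports", "health"}
sample_label_sets = [{"sports", "politics"}, {"health"}]

kind = single_label_problem_type(all_labels)
rows = to_indicator_rows(sample_label_sets, all_labels)
# Columns follow sorted order: health, politics, sports.
```

A single-label sample always has exactly one 1 in its row, while a multi-label sample may have several.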


Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/355, G06F18/24155, G06F18/214
Inventor: 刘鹏鹤, 孙晓平, 孙毓忠
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI