Feedback density peak value clustering method and system thereof

A density peak clustering method, applied in character and pattern recognition, instruments, computer parts, etc. It addresses factors that degrade the clustering results and the low accuracy of the density peak clustering algorithm on high-dimensional datasets, with the effect of improving accuracy, producing more accurate clusters, and reducing the error rate.

Inactive Publication Date: 2017-08-04
CHINA UNIV OF MINING & TECH

AI Technical Summary

Problems solved by technology

However, when multiple density peaks appear within a single class during clustering, abnormal sample points are selected as pseudo-cluster centers and one class is split into several, which degrades the clustering results. The algorithm is also not very accurate on high-dimensional datasets.



Examples


Embodiment 1

[0025] As shown in Figure 1, this embodiment includes the following steps:

[0026] Input: dataset X = {x_1, x_2, x_3, ..., x_n}, truncation distance d_c, and the combined index d.

[0027] Output: Cluster result labels.

[0028] Step 1: Use non-negative matrix factorization (NMF) to extract features from the dataset. The calculation formula is as follows:

[0029]

[0030]
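The formula for this step is not reproduced in the text above. As a minimal sketch, assuming the standard NMF formulation (approximate a non-negative data matrix X by a product WH with W and H non-negative, minimizing the squared Frobenius error), the well-known multiplicative update rules can be written in plain Python as follows. The function names `nmf`, `matmul`, `transpose`, and `frobenius_error`, as well as the rank and iteration count, are illustrative assumptions and do not come from the patent.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(x, rank, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative n x m matrix x into w (n x rank) and h (rank x m).

    Assumption: multiplicative updates for the squared Frobenius error.
    This is a standard choice, not necessarily the patent's own formula.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(x, w, h):
    """Frobenius norm of x - w h, used to check the quality of the factorization."""
    approx = matmul(w, h)
    return sum((xv - av) ** 2
               for xr, ar in zip(x, approx)
               for xv, av in zip(xr, ar)) ** 0.5
```

If each row of x is one sample, the rows of w can serve as the reduced feature vectors passed to the clustering steps; that orientation is also an assumption of this sketch.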

[0031] Step 2: Perform initial clustering with the density peak clustering algorithm.

[0032] Step 2.1: Calculate the distance between every pair of data points to form the distance matrix d_ij. For example, for two points with coordinates a(x_11, x_12, ..., x_1n) and b(x_21, x_22, ..., x_2n), the distance between them is:

[0033]
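The distance formula itself is missing from the text above. The following sketch assumes the ordinary Euclidean distance, d(a, b) = sqrt(sum over k of (x_1k - x_2k)^2), which is the usual choice in density peak clustering. `distance_matrix` is a hypothetical helper name.

```python
import math


def distance_matrix(points):
    """Return the symmetric matrix d[i][j] of pairwise Euclidean distances.

    Assumption: Euclidean distance; the source does not show its formula.
    """
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The matrix is symmetric, so each pair is computed once.
            d[i][j] = d[j][i] = math.dist(points[i], points[j])
    return d
```

For example, `distance_matrix([(0, 0), (3, 4)])` places 5.0 in the two off-diagonal entries.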

[0034] Step 2.2: Calculate the local density of the data points:

[0035]
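The density formula is not shown in the text above. A minimal sketch, assuming the cutoff-kernel definition used in the original density peak clustering algorithm, where the local density of point i is the number of other points whose distance to i is less than the truncation distance d_c. The function name `local_density` is illustrative.

```python
def local_density(d, d_c):
    """Count, for each point i, the other points j with d[i][j] < d_c.

    Assumption: cutoff kernel; a Gaussian kernel is a common alternative.
    """
    n = len(d)
    return [sum(1 for j in range(n) if j != i and d[i][j] < d_c)
            for i in range(n)]
```

With this kernel the densities are integers, so ties are common on small datasets; a smooth kernel avoids that at the cost of one extra design decision.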

[0036] Step 2.3: Calculate the distance δ_i between each data point and the nearest point of higher density. The calculation formula is as follows:

[0...
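The formula is truncated in the text above. The sketch below assumes the definition from the original density peak clustering algorithm: δ_i is the minimum distance from point i to any point of higher density, and for the point of highest density it is the maximum distance to any other point. Breaking density ties by sort order and returning the index of the nearest higher-density neighbor are choices made for this illustration only.

```python
def delta_and_neighbor(d, rho):
    """Return (delta, neighbor) for a distance matrix d and densities rho.

    neighbor[i] is the index of the nearest point that precedes i in
    decreasing-density order, or -1 for the global density peak.
    """
    n = len(rho)
    # Stable sort: among equal densities, the lower index counts as denser.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    neighbor = [-1] * n
    for k, i in enumerate(order):
        if k == 0:
            # Global density peak: use its largest distance to any point.
            delta[i] = max(d[i])
            continue
        j = min(order[:k], key=lambda q: d[i][q])
        delta[i] = d[i][j]
        neighbor[i] = j
    return delta, neighbor
```

Points with both a large density and a large δ stand out in the decision graph and are the candidates for cluster centers.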


Abstract

The invention provides a feedback density peak clustering method and system. It solves the problem in the original density peak algorithm that one class is split into several classes when multiple density peaks occur within it, and it also improves the accuracy of the original algorithm on high-dimensional datasets. The method comprises the following steps: 1, use non-negative matrix factorization to extract features from the dataset; 2, following the original density peak clustering algorithm, draw a decision graph and select several cluster centers; 3, use a "nearest neighbor" algorithm to assign the remaining points and remove noise points; 4, use an SVM to feed back the clustering result between two classes; and 5, merge the classes that can be merged according to the feedback result. The method effectively increases the robustness of the density peak algorithm, discovers clusters of arbitrary shape, handles high-dimensional data effectively, and achieves a good clustering effect.
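Steps 2 and 3 of the abstract can be sketched as follows, under stated assumptions. The sketch takes as inputs the local densities, the δ distances, and the index of each point's nearest higher-density neighbor (with -1 for the global density peak). Picking the k points with the largest product of density and δ stands in for reading centers off the decision graph, and assigning each remaining point the label of its nearest higher-density neighbor is the rule from the original density peak algorithm. The name `assign_labels` and the parameter `k` are hypothetical. Noise removal and the SVM feedback and merging of steps 4 and 5 are not covered.

```python
def assign_labels(rho, delta, neighbor, k):
    """Pick k centers by largest rho * delta, then propagate labels.

    Assumption: gamma = rho * delta replaces manual selection on the
    decision graph. Unassignable points keep the label -1.
    """
    n = len(rho)
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: -gamma[i])[:k]
    labels = [-1] * n
    for label, i in enumerate(centers):
        labels[i] = label
    # Visit points from densest to sparsest so that a point's nearest
    # higher-density neighbor has been handled before the point itself.
    for i in sorted(range(n), key=lambda i: -rho[i]):
        if labels[i] == -1 and neighbor[i] >= 0:
            labels[i] = labels[neighbor[i]]
    return labels
```

Because labels flow down the density ordering in a single pass, the assignment needs no iteration, which is one reason the original algorithm is fast; it is also why a falsely chosen center splits a class, the failure that the feedback step is meant to correct.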

Description

Technical field

[0001] The present invention is a feedback density peak clustering method and system that can automatically cluster datasets of any shape, and it relates to the fields of pattern recognition and machine learning. In particular, it uses an SVM model to feed back the clustering results between two classes, and it designs a new feedback strategy that merges classes according to the support vectors obtained from SVM training, in order to obtain accurate clustering results.

Background technique

[0002] Cluster analysis is a form of unsupervised learning and an important research direction in data mining. Clustering algorithms can be roughly divided into five types: partition-based, hierarchy-based, model-based, density-based, and grid-based. Density-based clustering algorithms regard clusters as high-density regions of objects separated by low-density regions in the data space, and they place no restriction on the shape of the clusters, which can be used to filter o...


Application Information

IPC(8): G06K9/62
CPC: G06F18/2321
Inventors: 丁世飞, 徐晓, 杜明晶, 贾洪杰, 徐丽, 胡乾坤
Owner: CHINA UNIV OF MINING & TECH