Information bottleneck-based possibility fuzzy joint clustering method

Applied in the field of Internet data processing, this information bottleneck-based clustering method addresses two problems: the strong effect of outliers on fuzzy joint clustering performance, and the inability of the Euclidean distance to measure similarity well in high-dimensional data.

Status: Inactive; Publication Date: 2017-05-31
HENAN POLYTECHNIC UNIV

AI Technical Summary

Problems solved by technology

[0003] Existing fuzzy joint clustering and possibility clustering methods each have their own advantages and limitations. Fuzzy joint clustering is stable and well suited to clustering high-dimensional sparse matrices, but its performance is strongly affected by outliers. Possibility clustering handles outliers better, but without good initialization it often produces coincident clusters. Moreover, in high-dimensional data the Euclidean distance does not measure the similarity between objects well.

Examples

Embodiment Construction

[0017] Specifically, the possibility fuzzy joint clustering method based on the information bottleneck includes the following steps:

[0018] (1) Set the values of the parameters $T_u$, $T_t$, $T_v$, $w_u$, $w_t$, $\varepsilon$ and $\tau_{\max}$, where $T_u$, $T_t$, $T_v$, $w_u$ and $w_t$ are user-defined weighting parameters, $\varepsilon$ is the maximum error limit, and $\tau_{\max}$ is the maximum number of iterations;

[0019] (2) Set the iteration counter $\tau$ to 1 and randomly initialize $u_{ci}$ and $t_{ci}$, where $u_{ci}$ is the partition membership degree of sample $i$ in cluster $c$, $t_{ci}$ is the typicality membership degree of sample $i$ in cluster $c$, $0 \le u_{ci} \le 1$ and $0 \le t_{ci} \le 1$; $c$ is a natural number from 1 to $C$, where $C$ is the number of clusters, and $i$ is a natural number from 1 to $N$, where $N$ is the total number of samples to be clustered;

[0020] (3) Update $p_{cj}$ using the update formula [formula not shown], where $p_{cj}$ denotes the $j$-th attribute value of the centroid of cluster $c$, $x_{ij}$ denotes the $j$-th attribute value of sample $i$, $j$ is a natural number from 1 ...
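Steps (1) and (2) above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function name `init_memberships`, the dictionary keys, and all numeric values are hypothetical, and normalizing each sample's partition memberships to sum to 1 across clusters is an assumption borrowed from common fuzzy clustering practice (the text only states that each value lies in [0, 1]). Step (3) is not sketched because its update formula is not available here.

```python
import random


def init_memberships(n_samples, n_clusters, seed=None):
    """Randomly initialise u[c][i] and t[c][i], each in [0, 1]."""
    rng = random.Random(seed)
    # Typicality t_ci: independent uniform draws in [0, 1).
    t = [[rng.random() for _ in range(n_samples)] for _ in range(n_clusters)]
    # Partition membership u_ci: uniform draws, then normalised per sample.
    # ASSUMPTION: memberships of one sample sum to 1 over the clusters.
    raw = [[rng.random() + 1e-12 for _ in range(n_samples)]
           for _ in range(n_clusters)]
    u = [[0.0] * n_samples for _ in range(n_clusters)]
    for i in range(n_samples):
        total = sum(raw[c][i] for c in range(n_clusters))
        for c in range(n_clusters):
            u[c][i] = raw[c][i] / total
    return u, t


# Step (1): user-defined settings (all values here are placeholders).
params = {"T_u": 1.0, "T_t": 1.0, "T_v": 1.0,
          "w_u": 1.0, "w_t": 1.0, "eps": 1e-5, "tau_max": 100}

# Step (2): iteration counter and random initialisation.
tau = 1
u, t = init_memberships(n_samples=6, n_clusters=3, seed=0)
```

Passing a fixed `seed` makes a run repeatable, which can help when comparing how sensitive the result is to the initial memberships.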

Abstract

The invention provides an information bottleneck-based possibility fuzzy joint clustering method. The method introduces the information bottleneck principle and uses the mutual information loss of the information bottleneck as the distance measure. It also combines the advantages of possibility clustering and fuzzy joint clustering and normalizes the centroids during clustering. The method reduces sensitivity to initial values, improves robustness and noise resistance, achieves higher clustering precision, and produces a more distinct fuzzy partition.
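To make the idea of mutual information loss as a distance concrete, the sketch below computes the merge cost used in the standard agglomerative information bottleneck formulation: the prior-weighted Jensen-Shannon divergence between two conditional distributions, scaled by their combined prior mass. This is an assumption about the general form of such a measure, not the exact formula of this invention, which is not reproduced here. The names `kl` and `ib_merge_cost` are hypothetical, and the result is in nats.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; 0 * log(0) counts as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def ib_merge_cost(w1, p1, w2, p2):
    """Mutual-information loss incurred by merging two items.

    w1, w2: positive prior weights of the two items.
    p1, p2: their conditional distributions over the attributes
            (each a list of non-negative values summing to 1).
    """
    total = w1 + w2
    pi1, pi2 = w1 / total, w2 / total
    # Distribution of the merged item: the prior-weighted mixture.
    mix = [pi1 * a + pi2 * b for a, b in zip(p1, p2)]
    # Weighted Jensen-Shannon divergence between p1 and p2.
    js = pi1 * kl(p1, mix) + pi2 * kl(p2, mix)
    return total * js


# Identical distributions lose no information when merged.
same = ib_merge_cost(0.5, [0.5, 0.5], 0.5, [0.5, 0.5])
# Distributions with disjoint support lose the most.
disjoint = ib_merge_cost(0.5, [1.0, 0.0], 0.5, [0.0, 1.0])
```

Unlike the Euclidean distance, this measure compares the shapes of two attribute distributions, so two samples with proportional attribute vectors are treated as similar regardless of their magnitudes once they are normalized.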

Description

Technical field

[0001] The invention relates to a possibility fuzzy joint clustering method based on the information bottleneck, and belongs to the field of Internet data processing.

Background technique

[0002] Recent studies indicate that the total number of Internet sites worldwide has exceeded 1 billion, and the number is still growing. People's access to knowledge is gradually shifting from traditional books and newspapers to the Internet, and with the rapid development of Internet technology the amount of data keeps increasing. Classifying data is a basic requirement of data processing and analysis. Clustering is an unsupervised data partitioning technique: according to the similarity and dissimilarity between objects, the data are grouped into several clusters so that the similarity of data within a cluster is maximized and the dissimilarity of data between clusters is maximized. There are now many mature clustering methods, but th...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06F17/30
CPC: G06F16/9535, G06F18/23
Inventor: 刘永利, 万兴, 晁浩, 刘志中, 郭倩倩
Owner: HENAN POLYTECHNIC UNIV