
High-performance parallel implementation method of the K-means algorithm on the domestic Sunway 26010 many-core processor

The method applies many-core processor technology to the K-means algorithm. It is classified under electrical digital data processing, instruments, and computing. It addresses the lack of a high-performance implementation and achieves improved performance and an optimized memory access path.

Active Publication Date: 2018-09-07
INST OF SOFTWARE - CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0009] The problem solved by the present invention is the absence of a high-performance clustering algorithm on the existing Sunway 26010 platform. The invention provides a parallel K-means implementation framework that fuses block distance matrix calculation with cluster label reduction. Through optimization methods such as three-layer partitioning, cooperative inter-core data sharing, double buffering, instruction rearrangement, and dynamic scheduling, it makes full use of the computing resources of the hardware platform and improves the computing performance of K-means.
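The idea of fusing the block distance calculation with the label reduction can be illustrated with a small sketch. It is not the patented implementation: it runs on one thread in plain Python, uses two block sizes rather than the three-layer partitioning, and leaves out inter-core sharing, double buffering, and instruction rearrangement. The function names and the parameters `sample_block` and `center_block` are hypothetical. The fused version computes one small block of the distance matrix at a time and immediately folds it into a running minimum per sample, so the full samples-by-centers matrix is never stored.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_unfused(samples, centers):
    """Baseline: build the full distance matrix, then reduce it in a second pass."""
    dist = [[squared_distance(s, c) for c in centers] for s in samples]
    return [min(range(len(centers)), key=row.__getitem__) for row in dist]


def cluster_fused(samples, centers, sample_block=4, center_block=2):
    """Fused sketch: compute one distance block, reduce it at once, discard it.

    sample_block and center_block are illustrative tile sizes (assumptions).
    """
    n, k = len(samples), len(centers)
    labels = [0] * n
    for s0 in range(0, n, sample_block):
        s1 = min(s0 + sample_block, n)
        # Running reduction state for the samples in this block.
        best_dist = [float("inf")] * (s1 - s0)
        best_label = [-1] * (s1 - s0)
        for c0 in range(0, k, center_block):
            c1 = min(c0 + center_block, k)
            # Small block of the distance matrix: (s1 - s0) x (c1 - c0).
            block = [
                [squared_distance(samples[i], centers[j]) for j in range(c0, c1)]
                for i in range(s0, s1)
            ]
            # Label reduction fused with the block computation.
            for r, row in enumerate(block):
                for off, d in enumerate(row):
                    if d < best_dist[r]:  # strict '<' keeps the first minimum
                        best_dist[r] = d
                        best_label[r] = c0 + off
        labels[s0:s1] = best_label
    return labels
```

Both functions return the same labels for the same input; the fused one only changes how much intermediate data is alive at any moment, which is the property that matters when the working set must fit in a small on-chip memory.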




Embodiment Construction

[0032] The present invention will be described in detail below in conjunction with the accompanying drawings and illustrations.

[0033] As shown in Figure 1, the present invention is a high-performance parallel implementation method of the K-means algorithm based on the domestic Sunway 26010 many-core processor. The K-means calculation is an iterative solution process divided into four main steps: initializing the center points, clustering, calculating the iterative convergence value, and updating the center points. First, a center point initialization method is chosen and the initial values are assigned to the center point matrix. Next, an initial clustering is performed, the iterative convergence value is computed, and convergence is checked. If the result has not converged, the main loop is entered; if it has converged, the current clustering result is returned. The main loop performs two main operations: clustering and updating the center points. Exit the main loop whe...
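The four steps described above can be sketched as a plain serial loop. This is a minimal illustration, not the patented parallel implementation. Several details are assumptions because the source text is truncated: centers are initialized by sampling `k` input points at random, the convergence value is taken to be the change in total squared distance between two successive clusterings, and an empty cluster keeps its previous center. All function and parameter names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_clusters(samples, centers):
    """Clustering step: nearest-center label per sample, plus the total cost."""
    labels, total = [], 0.0
    for s in samples:
        dists = [squared_distance(s, c) for c in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        labels.append(best)
        total += dists[best]
    return labels, total


def update_centers(samples, labels, centers):
    """Update step: each center becomes the mean of its assigned samples."""
    dim = len(samples[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for s, lab in zip(samples, labels):
        counts[lab] += 1
        for d in range(dim):
            sums[lab][d] += s[d]
    # Assumption: an empty cluster keeps its old center.
    return [
        [v / counts[c] for v in sums[c]] if counts[c] else list(centers[c])
        for c in range(len(centers))
    ]


def kmeans(samples, k, tol=1e-9, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize the center points (assumed: random sample of the input).
    centers = [list(s) for s in rng.sample(samples, k)]
    # Initial clustering and first cost value.
    labels, cost = assign_clusters(samples, centers)
    # Main loop: update the center points, then cluster again.
    for _ in range(max_iter):
        centers = update_centers(samples, labels, centers)
        labels, new_cost = assign_clusters(samples, centers)
        # Assumed convergence value: change in total squared distance.
        converged = abs(cost - new_cost) <= tol
        cost = new_cost
        if converged:
            break
    return centers, labels
```

For example, `kmeans([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]], k=2)` returns two centers and one label per sample. In this loop the clustering step dominates the cost, which is why the patent concentrates its optimizations there.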



Abstract

The invention provides a high-performance parallel implementation method of the K-means algorithm on the domestic Sunway 26010 many-core processor. For the clustering stage, the invention designs a calculation framework that fuses block distance matrix calculation with a reduction operation. The framework adopts a three-layer blocking strategy for task partitioning. It also includes a cooperative inter-core data sharing scheme and a cluster label reduction method based on the register communication mechanism, and it applies double buffering and instruction rearrangement optimizations. For the center point update stage, the invention designs a dynamically scheduled task partitioning mode. In tests on real data sets, the method reaches a maximum floating-point performance of 348.1 GFlops, which corresponds to 47%-84% of the theoretical peak performance. Compared with the non-fused calculation mode, it achieves a speedup of up to 1.7x and 1.3x on average.
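The abstract does not say how the dynamically scheduled partitioning of the center point update works. As a generic illustration of dynamic scheduling only, the sketch below lets worker threads pull small chunks of samples from a shared queue until it is empty, so a worker that finishes early simply takes more work. Everything here is an assumption for illustration: the function name, the `chunk` and `n_workers` parameters, the use of Python threads in place of compute cores, and the rule that an empty cluster keeps its old center.

```python
import threading
from queue import Queue, Empty


def update_centers_dynamic(samples, labels, old_centers, n_workers=4, chunk=2):
    """Center update with dynamically scheduled chunks (illustrative sketch)."""
    k, dim = len(old_centers), len(samples[0])

    # Task pool: each task is a half-open range of sample indices.
    tasks = Queue()
    for start in range(0, len(samples), chunk):
        tasks.put((start, min(start + chunk, len(samples))))

    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    lock = threading.Lock()

    def worker():
        # Accumulate privately, then merge once to limit lock contention.
        local_sums = [[0.0] * dim for _ in range(k)]
        local_counts = [0] * k
        while True:
            try:
                start, end = tasks.get_nowait()  # grab the next free chunk
            except Empty:
                break
            for i in range(start, end):
                c = labels[i]
                local_counts[c] += 1
                for d in range(dim):
                    local_sums[c][d] += samples[i][d]
        with lock:
            for c in range(k):
                counts[c] += local_counts[c]
                for d in range(dim):
                    sums[c][d] += local_sums[c][d]

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Assumption: an empty cluster keeps its old center.
    return [
        [v / counts[c] for v in sums[c]] if counts[c] else list(old_centers[c])
        for c in range(k)
    ]
```

Compared with a static split of the samples into equal shares, pulling chunks on demand balances the load when chunks take unequal time. Note that the order in which partial sums are merged can vary between runs, so floating-point results may differ in the last bits.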

Description

Technical field

[0001] The invention belongs to the research field of parallel acceleration of clustering algorithms in machine learning, and specifically relates to a high-performance parallel implementation method of the K-means algorithm on the domestic Sunway 26010 many-core processor.

Background technique

[0002] K-means is a classic distance-based clustering algorithm in unsupervised learning. It divides the sample data into clusters according to a similarity measure between samples, maximizing the similarity between samples in the same cluster. Because it is simple, easy to implement, and needs no labeled samples, K-means is widely used in image processing, data mining, text clustering, biology, and other fields, and is increasingly used as a preprocessing step for many more complex algorithms. With the advent of the era of big data, the feature dimension of the sample data has increased from the original tens of dimensions ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50, G06F9/48
CPC: G06F9/4881, G06F9/5038
Inventors: 杨超, 李敏, 闫碧莹
Owner INST OF SOFTWARE - CHINESE ACAD OF SCI