The invention discloses a clustering implementation method and
system. The method comprises the following steps: a master control node shards the candidate samples in a candidate
queue; and at least two computing nodes determine in parallel, according to a preset epsilon neighborhood and a preset
minimum density, whether each sample among the sharded samples allocated to them is a
core sample. Because the computing nodes process the samples in parallel,
the cluster to which each sample in the sample
database belongs is marked more quickly. The invention also discloses another clustering implementation method and
system, in which the method comprises the following steps: a master control node shards the samples in a sample
database that are not currently marked, and allocates and issues the sharded samples to at least two computing nodes;
the computing nodes process the candidate samples in a candidate
queue in parallel; and merge nodes combine the
processing results obtained from the computing nodes. Because each computing node processes only part of the samples, the problem that
mass data cannot be processed by a single computer is solved; and because the
mass data can be processed in parallel
by a plurality of computing nodes and a plurality of merge nodes,
processing efficiency is greatly improved.
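
The steps described above (sharding by a master control node, parallel core-sample determination by computing nodes, and combination of partial results) can be illustrated by the following minimal Python sketch. It is not the disclosed implementation. It rests on several assumptions that the abstract does not state: a thread pool stands in for the computing nodes, sharding is round-robin, distance is Euclidean, the minimum density is a neighbor count that includes the sample itself, and the names `shard`, `find_core_samples` and `parallel_core_samples` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def shard(items, n_shards):
    """Master-node step (assumed round-robin): split items into n_shards shards."""
    return [items[i::n_shards] for i in range(n_shards)]


def find_core_samples(shard_indices, samples, eps, min_density):
    """Computing-node step: return the indices in this shard that are core samples.

    Assumption: a sample is a core sample when at least min_density samples
    (the sample itself included) lie within Euclidean distance eps of it.
    """
    core = set()
    for i in shard_indices:
        neighbors = sum(1 for q in samples if dist(samples[i], q) <= eps)
        if neighbors >= min_density:
            core.add(i)
    return core


def parallel_core_samples(samples, candidate_queue, eps, min_density, n_nodes=2):
    """Shard the candidate queue, check each shard in parallel, combine the results."""
    shards = shard(candidate_queue, n_nodes)
    # Worker threads stand in for the computing nodes (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partial_results = pool.map(
            lambda s: find_core_samples(s, samples, eps, min_density), shards
        )
        # Merge step: take the union of the per-shard core-sample sets.
        return set().union(*partial_results)


if __name__ == "__main__":
    samples = [(0, 0), (0, 1), (1, 0), (10, 10)]
    candidate_queue = [0, 1, 2, 3]  # indices of the candidate samples
    print(parallel_core_samples(samples, candidate_queue, eps=1.5, min_density=3))
```

In this sketch every worker reads the full sample list and only the candidate indices are sharded. In a deployment on separate machines the sample data would also have to be partitioned or replicated, a point the abstract leaves open.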