The invention discloses a method and a device for implementing an association rule mining algorithm that supports distributed computation. A Hadoop MapReduce programming model running on HDFS (Hadoop Distributed File System) is used to decompose the association rule mining algorithm into two stages, a map function stage and a reduce function stage, and the method comprises the following steps: step 1, a
job scheduler is configured; step 2, a data set is read by a prior probability mapping module, and the data of the data set are converted by a map function into key-value pairs; step 3, the key-value pairs produced in step 2 are read by a prior probability reduction module, a Top-N ranking of rules containing i-item sets is randomly generated by a reduce function, and the prior probability distribution of the confidence is calculated at the same time; step 4, the same data set is read by a rule mapping module, and each data row of the data set is converted by the map function into a key-value pair; and step 5, the key-value pairs produced in step 4 and the prior probability distribution from step 3 are read by a rule reduction module, and the predictive accuracy of the Top-N ranked rules is calculated by the reduce function. The method and the device for implementing the association rule mining algorithm supporting distributed computation are mainly applied to distributed computing based on PA (Predictive Apriori).
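
The two map/reduce passes described in steps 2 to 5 can be illustrated with a minimal single-process sketch in Python. It is not the claimed implementation: the job scheduler of step 1 and the Hadoop runtime are omitted, and all function names, the (itemset, 1) key-value layout, the 10-bin histogram prior, the add-one smoothing, and the posterior-mean accuracy formula are assumptions made for illustration, since the abstract does not specify them.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb

BINS = 10  # hypothetical resolution of the discretized confidence prior


def map_rows(rows, max_size=2):
    """Map stage (steps 2 and 4): emit an (itemset, 1) pair per itemset in a row."""
    for row in rows:
        items = sorted(set(row))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                yield itemset, 1


def reduce_counts(pairs):
    """Shared reduce helper: sum the values emitted for each key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def candidate_rules(counts):
    """List (body, head, body_count, rule_count) for every body -> head rule."""
    rules = []
    for itemset, n_rule in counts.items():
        if len(itemset) < 2:
            continue
        for head in itemset:
            body = tuple(i for i in itemset if i != head)
            rules.append((body, head, counts[body], n_rule))
    return rules


def reduce_prior(pairs, sample_size=20, seed=0):
    """Step 3 (assumed form): sample rules at random, histogram their confidences."""
    rules = candidate_rules(reduce_counts(pairs))
    sample = random.Random(seed).sample(rules, min(sample_size, len(rules)))
    hist = [1.0] * BINS  # add-one smoothing so every bin keeps some prior mass
    for _, _, n_body, n_rule in sample:
        confidence = n_rule / n_body
        hist[min(int(confidence * BINS), BINS - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist]


def predictive_accuracy(n_body, n_rule, prior):
    """Posterior mean of the true confidence under a binomial likelihood."""
    num = den = 0.0
    for b, weight in enumerate(prior):
        c = (b + 0.5) / BINS  # bin midpoint, always strictly between 0 and 1
        like = comb(n_body, n_rule) * c**n_rule * (1 - c) ** (n_body - n_rule)
        num += c * like * weight
        den += like * weight
    return num / den


def reduce_rules(pairs, prior, top_n=3):
    """Step 5 (assumed form): score every rule and keep the Top N."""
    scored = [
        (predictive_accuracy(n_body, n_rule, prior), body, head)
        for body, head, n_body, n_rule in candidate_rules(reduce_counts(pairs))
    ]
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scored[:top_n]


# Toy data set; both passes read the same rows, as in steps 2 and 4.
rows = [["bread", "milk"], ["bread", "butter"], ["bread", "milk", "butter"], ["milk"]]
prior = reduce_prior(map_rows(rows))
top_rules = reduce_rules(map_rows(rows), prior)
```

In this sketch the first pass only estimates how confidences are distributed across sampled rules, and the second pass uses that distribution to correct each rule's observed confidence, so a rule supported by very few rows is pulled toward the prior rather than ranked on its raw confidence alone.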