Abstract:
The classic Apriori algorithm is memory-dependent, so it is suitable only for small-scale datasets and cannot cope with massive ones; it also does not take the user's needs into account. To address these problems, an improved Apriori algorithm for association rules with pre-item (antecedent) constraints, based on MapReduce, is proposed. First, the mining process of the classic Apriori algorithm is improved by adding user-specified pre-item and post-item (antecedent and consequent) constraints, which increases the degree of pruning during mining and yields more precise rules. Then, using the MapReduce programming model from cloud computing, the steps of the improved Apriori algorithm are parallelized. The experimental results show that the improved algorithm has certain advantages on different datasets. After parallelization with the MapReduce model, it processes massive data with greater capacity and efficiency and has good scalability.
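
The two ideas summarized above, constraint-based pruning and MapReduce-style support counting, can be illustrated with a minimal single-process sketch. It is not the implementation evaluated in this work. It assumes that the user constraint is given as two item sets (the hypothetical parameters `antecedent_items` and `consequent_items`), that items outside both sets are pruned before candidate generation, and that the map and reduce steps are plain Python functions (`map_count`, `reduce_count`) standing in for jobs on a real cluster.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, candidates):
    """Map step (simulated): count each candidate itemset within one data partition."""
    counts = Counter()
    for transaction in partition:
        for cand in candidates:
            if cand <= transaction:  # candidate is contained in the transaction
                counts[cand] += 1
    return counts


def reduce_count(partial_counts, min_count):
    """Reduce step (simulated): sum the local counts and keep the frequent itemsets."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {c: n for c, n in total.items() if n >= min_count}


def constrained_apriori(partitions, min_count, antecedent_items, consequent_items):
    """Level-wise Apriori restricted to items named in the user constraint (assumption)."""
    allowed = set(antecedent_items) | set(consequent_items)
    candidates = {frozenset([item]) for item in allowed}
    frequent = {}
    k = 1
    while candidates:
        level = reduce_count([map_count(p, candidates) for p in partitions], min_count)
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def constrained_rules(frequent, antecedent_items, consequent_items, min_conf):
    """Emit rules A -> C with A inside antecedent_items and C inside consequent_items."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for left in combinations(sorted(itemset), r):
                a = frozenset(left)
                c = itemset - a
                if a <= set(antecedent_items) and c <= set(consequent_items):
                    conf = count / frequent[a]
                    if conf >= min_conf:
                        rules.append((a, c, conf))
    return rules


if __name__ == "__main__":
    # Toy data split into two partitions, as if stored on two nodes.
    partitions = [
        [{"bread", "milk"}, {"bread", "butter", "milk"}],
        [{"bread", "butter"}, {"milk", "eggs"}, {"bread", "butter", "eggs"}],
    ]
    freq = constrained_apriori(partitions, 2, {"bread"}, {"butter"})
    for a, c, conf in constrained_rules(freq, {"bread"}, {"butter"}, 0.7):
        print(sorted(a), "->", sorted(c), round(conf, 2))
```

In this sketch the constraint shrinks the candidate set before the first counting pass, so items such as `milk` and `eggs` are never counted at all. On an actual MapReduce deployment, each call to `map_count` would correspond to a mapper working on one input split and `reduce_count` to the reducer that aggregates the emitted counts.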