Abstract:
To address association rule mining over massive databases, an effective parallel approach to mining closed frequent itemsets, based on the division of equivalence classes, is presented. Built on the MapReduce framework, the approach proceeds in three steps: 1) division into equivalence classes, 2) allocation of the data set, and 3) asynchronous mining and aggregation of closed frequent itemsets. This strategy substantially alleviates the load-balancing problem across multiple nodes and yields reliable closed frequent itemsets. Experimental results show that the approach effectively overcomes drawbacks of traditional approaches, such as low mining efficiency and a large number of redundant rules, and achieves higher performance.
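
The three steps can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the abstract does not say how equivalence classes are formed or how data are allocated, so the sketch assumes prefix-based classes (one class per frequent item, holding the itemsets whose first item in sorted order is that item) and a greedy largest-first assignment to workers. All function names are hypothetical, and a plain loop stands in for the MapReduce nodes.

```python
from collections import Counter
from itertools import combinations


def frequent_items(transactions, min_support):
    """Return the items whose support reaches min_support, in sorted order."""
    counts = Counter(item for t in transactions for item in set(t))
    return sorted(i for i, c in counts.items() if c >= min_support)


def split_into_classes(transactions, items):
    """Step 1 (assumed prefix-based): one class per frequent item.

    Each class gets a projected data set: the transactions containing the
    prefix, restricted to frequent items ordered after the prefix.
    """
    rank = {item: r for r, item in enumerate(items)}
    return {
        prefix: [
            frozenset(i for i in t if i in rank and rank[i] > rank[prefix])
            for t in transactions
            if prefix in t
        ]
        for prefix in items
    }


def assign_to_workers(classes, n_workers):
    """Step 2 (assumed greedy): largest class goes to the least-loaded worker."""
    def size(db):
        return sum(len(t) for t in db)

    loads = [0] * n_workers
    plan = [[] for _ in range(n_workers)]
    for prefix, db in sorted(classes.items(), key=lambda kv: -size(kv[1])):
        w = loads.index(min(loads))
        plan[w].append(prefix)
        loads[w] += size(db)
    return plan


def mine_class(prefix, projected, min_support):
    """Step 3a: brute-force enumeration of frequent itemsets in one class."""
    result = {frozenset([prefix]): len(projected)}
    candidates = sorted(set().union(*projected))
    for size in range(1, len(candidates) + 1):
        found = False
        for combo in combinations(candidates, size):
            s = frozenset(combo)
            support = sum(1 for t in projected if s <= t)
            if support >= min_support:
                result[s | {prefix}] = support
                found = True
        if not found:  # no frequent itemset of this size, so none larger
            break
    return result


def keep_closed(itemsets):
    """Step 3b: keep itemsets with no proper superset of equal support."""
    return {
        s: sup
        for s, sup in itemsets.items()
        if not any(s < o and sup == osup for o, osup in itemsets.items())
    }


def mine_closed(transactions, min_support, n_workers=2):
    items = frequent_items(transactions, min_support)
    classes = split_into_classes(transactions, items)
    plan = assign_to_workers(classes, n_workers)
    merged = {}
    for worker_prefixes in plan:  # each iteration stands in for one node
        for prefix in worker_prefixes:
            merged.update(mine_class(prefix, classes[prefix], min_support))
    return keep_closed(merged)


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}]
closed = mine_closed(transactions, min_support=2)
```

In this sketch the closedness filter runs only after aggregation, because a superset with equal support may belong to a different class than the itemset it subsumes. Here `{a}` and `{b}` are dropped in favor of `{a, b}`, which has the same support.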