An improved parallel association rules algorithm based on MapReduce framework for big data
Association rule mining is one of the most popular and significant problems in data mining; it aims to discover interesting relations between variables in a database. In this paper, we implement an improved parallel Apriori algorithm that performs both the counting and the candidate generation steps under the MapReduce framework, whereas existing parallel Apriori algorithms parallelize only the counting step. We analyze the time complexity of our improved parallel algorithm and compare it with that of the original parallel algorithm; the analysis indicates that our algorithm is advantageous when the number of candidate itemsets is massive. Our experimental results show that our algorithm performs better in big data settings and achieves excellent speedup.
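To make the two parallelized steps concrete, the following is a minimal sketch of how counting and candidate generation might each be expressed as a map/reduce job. It is an in-memory simulation, not the implementation evaluated in the paper: the helper names (`run_mapreduce`, `count_step`, `candidate_step`, `apriori`) are hypothetical, and joining frequent k-itemsets on their shared (k-1)-item prefix is an assumed way to distribute candidate generation, since the abstract does not specify the scheme.

```python
from collections import defaultdict
from itertools import combinations


def run_mapreduce(records, mapper, reducer):
    """Simulate one MapReduce job in memory: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


def count_step(transactions, candidates, min_support):
    """Counting job: emit (candidate, 1) per containing transaction, then sum."""
    def mapper(transaction):
        items = set(transaction)
        for candidate in candidates:
            if items.issuperset(candidate):
                yield candidate, 1

    def reducer(candidate, ones):
        support = sum(ones)
        if support >= min_support:
            yield candidate, support

    return dict(run_mapreduce(transactions, mapper, reducer))


def candidate_step(frequent):
    """Candidate generation job (assumed scheme): join on the (k-1)-prefix."""
    frequent = set(frequent)

    def mapper(itemset):
        # Key: all items but the last; value: the last item.
        yield itemset[:-1], itemset[-1]

    def reducer(prefix, last_items):
        for a, b in combinations(sorted(last_items), 2):
            candidate = prefix + (a, b)
            # Apriori pruning: every k-subset must itself be frequent.
            subsets = combinations(candidate, len(candidate) - 1)
            if all(sub in frequent for sub in subsets):
                yield candidate

    return run_mapreduce(sorted(frequent), mapper, reducer)


def apriori(transactions, min_support):
    """Alternate the two jobs until no candidates remain."""
    items = sorted({item for t in transactions for item in t})
    candidates = [(item,) for item in items]  # itemsets are sorted tuples
    result = {}
    while candidates:
        frequent = count_step(transactions, candidates, min_support)
        result.update(frequent)
        candidates = candidate_step(frequent)
    return result
```

Calling `apriori(transactions, min_support)` with a list of item tuples returns a dictionary that maps each frequent itemset to its support count. In a real cluster each call to `run_mapreduce` would correspond to one distributed job, so the candidate generation work is spread across reducers by prefix rather than being performed on a single node.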