Parallel miner


Over the last decade, High Utility Itemset (HUI) mining has emerged as a popular pattern mining approach. HUI mining discovers the itemsets whose profit exceeds a user-defined profit threshold. High Average-Utility Itemset (HAUI) mining improves on HUI mining by taking the length of an itemset into account, which refines the patterns and keeps the mining process fair to itemsets of different lengths. In the era of big data, traditional HAUI mining algorithms are not suitable for processing large transaction datasets on a standalone system because of limited processing resources. Therefore, several distributed frameworks have been developed to process big data on clusters of commodity hardware. This paper presents a parallel version of the traditional HAUI-Miner algorithm, called Parallel High-Average Utility Itemset Miner (PHAUIM). PHAUIM is a Spark-based distributed algorithm that splits the dataset into multiple chunks and distributes them across cluster nodes so that each chunk is processed in parallel. In addition, an improved approach to search space division is developed. The proposed search space division technique assigns the workload fairly to each node and improves performance. Comprehensive experiments have been performed to measure the performance of PHAUIM in terms of speedup and data scalability. PHAUIM is also compared with traditional HAUIM.
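To make the difference between utility and average utility concrete, the following is a minimal single-machine sketch in Python. It is not the PHAUIM algorithm or HAUI-Miner: it enumerates every itemset by brute force, with no pruning and no distribution. The toy transactions, the unit profits, the function names, and the use of an absolute threshold with `>=` are all assumptions made for illustration. Average utility is taken here as the total utility of an itemset over the transactions that contain it, divided by the number of items in the itemset.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]

# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1}


def utility(itemset, transaction):
    """Utility of an itemset in one transaction (0 if not fully contained)."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * profit[item] for item in itemset)


def total_utility(itemset):
    """Utility of an itemset summed over the whole database."""
    return sum(utility(itemset, t) for t in transactions)


def average_utility(itemset):
    """Total utility divided by the itemset length."""
    return total_utility(itemset) / len(itemset)


def mine_hauis(min_avg_utility):
    """Brute-force enumeration of itemsets meeting the assumed threshold."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            au = average_utility(itemset)
            if au >= min_avg_utility:
                result[itemset] = au
    return result


if __name__ == "__main__":
    for itemset, au in mine_hauis(8).items():
        print(itemset, total_utility(itemset), au)
```

In this toy data the itemset `("a", "b", "c")` has a total utility of 10, so it would pass a plain utility threshold of 8, but its average utility is only 10/3, so it is not reported as a HAUI. This is the length normalization that HAUI mining adds on top of HUI mining.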