The International Arab Journal of Information Technology (IAJIT)


Mining Frequent Sequential Rules with An Efficient Parallel Algorithm

Sequential rule mining is one of the most common data mining techniques. It aims to find the desired rules in large sequence databases, identifying the essential information that helps acquire knowledge from large search spaces and selecting interesting rules from sequence databases. The key challenge is to avoid wasted computation time, which is particularly difficult in large sequence databases. This paper studies mining rules from two representations of sequential patterns in order to obtain compact databases without affecting the final result. In addition, it executes a parallel approach that utilizes a multi-core processor architecture for mining non-redundant sequential rules, and it applies pruning techniques to improve the efficiency of rule generation. The proposed algorithm was evaluated by comparing it with another non-redundant sequential rule algorithm, Non-Redundant with Dynamic Bit Vector (NRD-DBV). Both algorithms were run on four real datasets with different characteristics. Our experiments show the performance of the

The International Arab Journal of Information Technology, Vol. 19, No. 1, January 2022