The International Arab Journal of Information Technology (IAJIT)



An Effective Reference-Point-Set (RPS) Based Bi-Directional Frequent Itemset Generation

Data Mining (DM) combines techniques from several fields to extract hidden patterns from large volumes of historical data. Association Rule Mining (ARM) is the DM task that produces association rules. To reduce time and space complexity, this paper proposes a bi-directional frequent itemset generation approach in which the dataset is explicitly split into dense and sparse regions while frequent itemsets are mined. The paper also proposes predetermining a candidate subset, called the Reference-Point-Set (RPS), which reduces the number of scans over the actual dataset. The novelty lies in examining possible candidates during the initial database scans, which cuts down the number of additional scans required. In experiments across different support counts, the average scan count of the proposed method is 24% lower than that of Dynamic Itemset Counting (DIC) and 65% lower than that of M-Apriori. The proposed method typically reduces execution time by 10% relative to DIC and is three times more efficient than M-Apriori. These results support the proposed approach for generating frequent itemsets from large datasets.
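To make the scan-count metric concrete, the following is a minimal sketch of the classic level-wise (Apriori-style) baseline, which needs one full pass over the database per itemset size. It is not the proposed RPS method, whose details are not given in this abstract. The function name `apriori_with_scan_count`, its parameters, and the toy transactions are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori_with_scan_count(transactions, min_support):
    """Level-wise frequent itemset mining (illustrative baseline only).

    Returns (frequent, scans): a dict mapping each frequent itemset
    (a frozenset) to its support count, and the number of full passes
    made over the transaction list.
    """
    transactions = [frozenset(t) for t in transactions]
    scans = 0

    # Scan 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    scans += 1

    frequent = {s: c for s, c in counts.items() if c >= min_support}
    current = set(frequent)  # frequent itemsets of size k - 1
    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        if not candidates:
            break

        # One more full database scan to count the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        scans += 1

        current = {c for c, n in counts.items() if n >= min_support}
        frequent.update({c: counts[c] for c in current})
        k += 1
    return frequent, scans


# Toy data (hypothetical): five transactions over items a, b, c.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent, scans = apriori_with_scan_count(toy, min_support=3)
print(scans, len(frequent))  # 3 scans, 6 frequent itemsets
```

In this baseline the number of scans grows with the size of the longest candidate itemset, which is the cost that approaches such as DIC, and the RPS idea described above, aim to reduce by considering candidates earlier.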

[1] Agrawal R., Imieliński T., and Swami A., “Mining Association Rules Between Sets of Items in Large Databases,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, Washington, pp. 207-216, 1993.

[2] Agrawal R. and Srikant R., “Fast Algorithms for Mining Association Rules,” in Proceedings of the 20th International Conference on Very Large Data Bases, Chile, pp. 487-499, 1994. https://dl.acm.org/doi/10.5555/645920.672836

[3] Bagui S. and Stanley P., “Mining Frequent Itemsets from Streaming Transaction Data Using Genetic Algorithms,” Journal of Big Data, vol. 7, no. 54, 2020. https://doi.org/10.1186/s40537-020-00330-9

[4] Brin S., Motwani R., Ullman J., and Tsur S., “Dynamic Itemset Counting and Implication Rules for Market Basket Data,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, Tucson, Arizona USA, pp. 255-264, 1997.

[5] Cai S., Hao S., Sun R., and Wu G., “Mining Recent Maximal Frequent Itemsets over Data Streams with Sliding Window,” The International Arab Journal of Information Technology, vol. 16, no. 6, pp. 961-969, 2019. https://www.iajit.org/PDF/November%202019,%20No.%206/15400.pdf

[6] Ceglar A. and Roddick J., “Association Mining,” ACM Computing Surveys, vol. 38, no. 2, pp. 1-42, 2006. https://doi.org/10.1145/1132956.1132958

[7] Chen J. and Xiao K., “BISC: A Bitmap Itemset Support Counting Approach for Efficient Frequent Itemset Mining,” ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 3, pp. 1-37, 2010. https://doi.org/10.1145/1839490.1839493

[8] Fujioka K. and Shirahama K., “Generic Itemset Mining Based on Reinforcement Learning,” IEEE Access, vol. 10, pp. 5824-5841, 2022. https://doi.org/10.48550/arXiv.2105.07753

[9] Goethals B., “Frequent Itemset Mining Implementations Repository,” http://fimi.uantwerpen.be/data, Last Visited, 2023.

[10] Hamilton H., “Dynamic Itemset Counting and Implication Rules for Market Basket Data,” http://www2..uregina.ca/~dbd/cs831/notes/itemsets/DIC.html, Last Visited, 2023.

[11] Han J., Pei J., and Kamber M., Data Mining: Concepts and Techniques, Morgan Kaufmann, 2011. https://www.sciencedirect.com/book/9780123814791/data-mining-concepts-and-techniques

[12] Han J., Pei J., Yin Y., and Mao R., “Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach,” Data Mining and Knowledge Discovery, vol. 8, no. 1, pp. 53-87, 2004. https://doi.org/10.1023/B:DAMI.0000005258.31418.83

[13] Leeuwen M. and Galbrun E., “Association Discovery in Two-View Data,” IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 12, pp. 3190-3202, 2015. https://doi.org/10.1109/TKDE.2015.2453159

[14] Lin D. and Kedem Z., “Pincer-Search: An Efficient Algorithm for Discovering the Maximum Frequent Set,” IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 3, pp. 553-566, 2002. https://doi.org/10.1109/TKDE.2002.1000342

[15] Li F., Meng C., Wang C., and Fan S., “Equipment Quality Information Mining Method Based on Improved Apriori Algorithm,” Journal of Sensors, vol. 2023, 2023. https://doi.org/10.1155/2023/2155590

[16] Magdy M., Ghaleb F., Mohamed D., and Zakaria W., “CC-IFIM: An Efficient Approach for Incremental Frequent Itemset Mining Based on Closed Candidates,” The Journal of Supercomputing, pp. 7877-7899, 2023. https://doi.org/10.1007/s11227-022-04976-5

[17] Maolegi M. and Arkok B., “An Improved Apriori Algorithm for Association Rules,” International Journal on Natural Language Computing, vol. 3, no. 1, 2014. https://doi.org/10.48550/arXiv.1403.3948

[18] Park J., Chen M., and Yu P., “An Effective Hash Based Algorithm for Mining Association Rules,” ACM SIGMOD Record, vol. 24, no. 2, pp. 175-186, 1995. https://doi.org/10.1145/568271.223813

[19] Phan H. and Le B., “A Novel Algorithm for Frequent Itemsets Mining in Transactional Databases,” Trends and Applications in Knowledge Discovery and Data Mining, vol. 11154, pp. 243-255, 2018. https://doi.org/10.1007/978-3-319-95786-9_21

[20] Savasere A., Omiecinski E., and Navathe S., “An Efficient Algorithm for Mining Association Rules in Large Databases,” in Proceedings of the 21st International Conference on Very Large Data Bases, San Francisco, pp. 432-443, 1995.

[21] Song W., Yang B., and Xu Z., “Index-BitTableFI: An Improved Algorithm for Mining Frequent Itemsets,” Knowledge-Based Systems, vol. 21, no. 6, pp. 507-513, 2008. https://doi.org/10.1016/j.knosys.2008.03.011

[22] Thurachon W. and Kreesuradej W., “Incremental Association Rule Mining with a Fast Incremental Updating Frequent Pattern Growth Algorithm,” IEEE Access, vol. 9, pp. 55726-55741, 2021. https://doi.org/10.1109/ACCESS.2021.3071777

[23] Toivonen H., “Sampling Large Databases for Association Rules,” in Proceedings of the 22nd International Conference on Very Large Data Bases, San Francisco, pp. 134-145, 1996. https://dl.acm.org/doi/10.5555/645922.673325

The International Arab Journal of Information Technology, Vol. 20, No. 6, November 2023

[24] Wang L., Cheung D., Cheng R., Lee S., and Yang X., “Efficient Mining of Frequent Itemsets on Large Uncertain Databases,” IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 12, pp. 2170-2183, 2012. https://doi.org/10.1109/TKDE.2011.165

[25] Webb G. and Vreeken J., “Efficient Discovery of the Most Interesting Associations,” ACM Transactions on Knowledge Discovery from Data, vol. 8, no. 3, pp. 1-31, 2014. https://doi.org/10.1145/2601433

[26] Zaki M., “Scalable Algorithms for Association Mining,” IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 3, pp. 372-390, 2000. https://doi.org/10.1109/69.846291

[27] Zhang C., Tian P., Zhang X., Liao Q., and Jiang Z., “HashEclat: An Efficient Frequent Itemset Algorithm,” International Journal of Machine Learning and Cybernetics, vol. 10, pp. 3003-3016, 2019. https://doi.org/10.1007/s13042-018-00918-x

[28] Zhao Z., Zhou J., Gabu G., Alroobaea R., and Masud M., “An Improved Association Rule Mining Algorithm for Large Data,” Journal of Intelligent Systems, vol. 30, no. 1, pp. 750-762, 2021. https://doi.org/10.1515/jisys-2020-0121