The International Arab Journal of Information Technology (IAJIT)



Mining Recent Maximal Frequent Itemsets Over Data Streams with Sliding Window

The sheer volume of data in a stream makes it impractical to mine all recent frequent itemsets. Because the maximal frequent itemsets imply all the frequent itemsets and are far fewer in number, mining them costs much less time and memory. This paper proposes an improved method called Recent Maximal Frequent Itemsets Mining (RMFIsM) to mine recent maximal frequent itemsets over data streams with a sliding window. The RMFIsM method uses two matrices to store the information of the data stream: the first matrix stores the information of each transaction, and the second stores the frequent 1-itemsets. The frequent p-itemsets are mined by an “extension” process that starts from the frequent 2-itemsets, and the maximal frequent itemsets are obtained by deleting the sub-itemsets of longer frequent itemsets. Finally, the performance of the RMFIsM method is evaluated through a series of experiments; the results show that the proposed method mines recent maximal frequent itemsets efficiently.
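The pipeline summarized in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not give the exact matrix layouts, so the 0/1 transaction matrix, the row-index sets kept for each frequent 1-itemset, the function name `maximal_frequent_itemsets`, and the `min_count` parameter are all assumptions made for illustration. The sliding window is modeled with a bounded `deque`, so the oldest transaction drops out when a new one arrives.

```python
from collections import deque
from itertools import combinations


def maximal_frequent_itemsets(window, min_count):
    """Return the maximal frequent itemsets among the transactions in `window`.

    Hypothetical helper: an itemset is frequent when at least `min_count`
    transactions in the window contain it.
    """
    items = sorted({item for t in window for item in t})
    # First matrix (assumed layout): one 0/1 row per transaction, one column per item.
    tx_matrix = [[1 if item in t else 0 for item in items] for t in window]

    # Second structure (assumed layout): for every frequent 1-itemset,
    # the set of row indices of the transactions that contain it.
    frequent = {}
    for col, item in enumerate(items):
        rows = frozenset(r for r, row in enumerate(tx_matrix) if row[col])
        if len(rows) >= min_count:
            frequent[frozenset([item])] = rows

    # Frequent 2-itemsets: intersect the row sets of two frequent items.
    level = {}
    for a, b in combinations(list(frequent), 2):
        rows = frequent[a] & frequent[b]
        if len(rows) >= min_count:
            level[a | b] = rows

    # "Extension": join two frequent k-itemsets that differ in one item
    # to get a (k+1)-itemset, and keep it if it is still frequent.
    while level:
        frequent.update(level)
        next_level = {}
        for x, y in combinations(list(level), 2):
            cand = x | y
            if len(cand) == len(x) + 1 and cand not in next_level:
                rows = level[x] & level[y]
                if len(rows) >= min_count:
                    next_level[cand] = rows
        level = next_level

    # Maximal itemsets: drop every itemset that is a proper subset of another.
    return {s for s in frequent if not any(s < other for other in frequent)}


# Sliding window holding the 4 most recent transactions.
window = deque(maxlen=4)
for transaction in [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"b", "d"}]:
    window.append(transaction)
before = maximal_frequent_itemsets(window, min_count=2)
print(sorted(sorted(s) for s in before))  # [['a', 'b', 'c']]

window.append({"a", "c"})  # the oldest transaction {a, b, c} slides out
after = maximal_frequent_itemsets(window, min_count=2)
print(sorted(sorted(s) for s in after))   # [['a', 'b'], ['a', 'c']]
```

In this toy stream, {a, b, c} is the only maximal frequent itemset in the first window; once the oldest transaction leaves, {a, b, c} is supported by a single transaction and the maximal sets become {a, b} and {a, c}. The sketch recomputes everything per window for clarity, whereas an incremental update of the two matrices would be the natural choice for a real stream.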


Saihua Cai is a Ph.D. student in the College of Information and Electrical Engineering, China Agricultural University, China. He received the MS degree from Jiangsu University, China, in 2016. His major research interests include uncertain data management, data mining, outlier detection and software testing.

Shangbo Hao is a master's student in the College of Information and Electrical Engineering, China Agricultural University, China. His research interests include pattern mining and outlier detection.

Ruizhi Sun is a Full Professor in the College of Information and Electrical Engineering, China Agricultural University, China. He received his Ph.D. degree in Computer Science and Technology from Tsinghua University, Beijing, China, in 2003. His major research interests include agricultural data acquisition and processing technology, computer network and applications, workflow management and cloud computing.

Gang Wu is an associate professor in Secretary of Computer Science Department, Tarim University, China. His research interests mainly involve agriculture information processing technology, data mining, and agricultural remote sensing application.