
AD-DCFP: Anomaly Detection Based on the Distance of Closed Frequent Patterns
Frequent Pattern-based (FP) anomaly detection methods can detect potential anomalies accurately because they consider both the appearing frequency and the degree of deviation of each data sample, which coincides with the definition of an anomaly. Because Closed Frequent Patterns (CFPs) are a subset of FPs and far fewer in number, CFP-based Anomaly Detection (AD) methods are more time-efficient. However, the small number of patterns used in the AD process leads to low detection accuracy. That is, time efficiency and detection accuracy are conflicting goals in FP-based anomaly detection. To address this problem, this paper introduces an AD method based on the distance of CFPs, namely Anomaly Detection Based on the Distance of Closed Frequent Patterns (AD-DCFP). AD-DCFP uses the distance of CFPs (the discrepancy between CFPs and data samples) to offset the negative impact of the small number of patterns used in AD, thereby detecting anomalies quickly and accurately. Specifically, vertical mining and a bit-vector structure are used to mine CFPs, which improves mining efficiency; then, the concept of pattern distance is introduced in the AD phase to calculate the abnormal degree of each data sample; finally, the data samples with the top-k ranked abnormal degrees are judged to be anomalies. Extensive experiments on six datasets show that, compared with five state-of-the-art methods, the proposed AD-DCFP method improves average detection accuracy by about 5% and reduces time consumption by about 10%, making it a better choice for large-scale or high-dimensional datasets.
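The three stages named in the abstract (vertical bit-vector mining of CFPs, a pattern-distance score, and top-k ranking) can be illustrated with a minimal Python sketch. The abstract does not give the paper's distance formula, so the score used here is an assumption: the fraction of a pattern's items that a sample lacks, weighted by the pattern's relative support and normalized over all CFPs. The function names (`build_tidsets`, `mine_frequent`, `closed_patterns`, `anomaly_scores`, `top_k_anomalies`) and the toy data are hypothetical, and the miner is a plain Eclat-style search rather than the authors' algorithm.

```python
def build_tidsets(transactions):
    """Vertical layout: map each item to a bit vector (a Python int)
    whose bit i is set when transaction i contains the item."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item] = tidsets.get(item, 0) | (1 << tid)
    return tidsets


def mine_frequent(tidsets, n_transactions, min_count):
    """Eclat-style depth-first search; support is the popcount of the
    bitwise AND of the item bit vectors."""
    frequent = {}

    def extend(prefix, prefix_bits, candidates):
        for idx, (item, bits) in enumerate(candidates):
            new_bits = prefix_bits & bits
            if bin(new_bits).count("1") >= min_count:
                pattern = prefix | {item}
                frequent[pattern] = new_bits
                extend(pattern, new_bits, candidates[idx + 1:])

    all_bits = (1 << n_transactions) - 1
    extend(frozenset(), all_bits, sorted(tidsets.items()))
    return frequent


def closed_patterns(frequent):
    """Keep the longest pattern per distinct bit vector. Patterns that share
    a bit vector share a support, and their union is the closed pattern."""
    best = {}
    for pattern, bits in frequent.items():
        if bits not in best or len(pattern) > len(best[bits]):
            best[bits] = pattern
    return {pattern: bin(bits).count("1") for bits, pattern in best.items()}


def anomaly_scores(transactions, closed):
    """ASSUMED score, not the paper's formula: support-weighted average of
    the fraction of each closed pattern's items missing from the sample."""
    n = len(transactions)
    total_weight = sum(count / n for count in closed.values())
    scores = []
    for items in transactions:
        items = set(items)
        score = sum(
            (count / n) * len(pattern - items) / len(pattern)
            for pattern, count in closed.items()
        )
        scores.append(score / total_weight if total_weight else 0.0)
    return scores


def top_k_anomalies(scores, k):
    """Indices of the k samples with the highest abnormal degree."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Hypothetical toy data: four similar samples and one that shares no items.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "c"},
    {"x", "y"},
]
tidsets = build_tidsets(transactions)
frequent = mine_frequent(tidsets, len(transactions), min_count=2)
closed = closed_patterns(frequent)
scores = anomaly_scores(transactions, closed)
print(top_k_anomalies(scores, k=2))  # prints [4, 2]
```

On this toy data the seven frequent patterns collapse to two closed ones, {a, b} and {a, b, c}, which illustrates why CFPs are cheaper to score against than the full FP set. Sample 4 matches neither pattern and ranks first; sample 2 lacks only `c` and ranks second.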
The International Arab Journal of Information Technology, Vol. 22, No. 2, March 2025