A Modified DBSCAN Algorithm for Anomaly Detection in Time-series Data with Seasonality
Anomaly detection is the task of identifying observations or patterns that deviate from a dataset's expected behaviour. It has significant practical applications in many domains, including public health, finance, Information Technology (IT), security, medicine, energy, and climate studies. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a density-based clustering algorithm that is also capable of identifying anomalous data. In this paper, a modified DBSCAN algorithm is proposed for anomaly detection in time-series data with seasonality. For experimental evaluation, a monthly temperature dataset was employed, and the analysis demonstrates the advantages of the modified DBSCAN over the standard DBSCAN algorithm on seasonal datasets. The results indicate that standard DBSCAN finds anomalies in a dataset but fails to find local anomalies in seasonal data, whereas the proposed modified DBSCAN approach finds both global and local anomalies. Standard DBSCAN flagged 19 points (2.16%) as anomalies, while the modified approach flagged 42 points (4.79%). The modified approach therefore flags about 2.21 times as many anomalies (42/19 ≈ 2.21), a difference of 2.63 percentage points. Hence, the proposed modified DBSCAN algorithm outperforms the standard DBSCAN algorithm at finding local anomalies.
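The distinction between global and local anomalies in seasonal data can be illustrated with a minimal sketch. The abstract does not describe the modification itself, so the per-month variant below is a hypothetical stand-in and not the method proposed in the paper: it simply runs a plain 1-D DBSCAN separately on each calendar month. The data, the function names, and the parameters `eps=3.0` and `min_pts=3` are all assumptions chosen for illustration.

```python
from collections import defaultdict

NOISE = -1

def dbscan_1d(values, eps, min_pts):
    """Plain DBSCAN on 1-D data; returns a cluster id per point, or -1 for noise."""
    n = len(values)
    # Brute-force eps-neighbourhoods (a point counts as its own neighbour).
    neighbours = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = NOISE  # provisional; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels

def noise_indices(values, eps, min_pts):
    """Indices that DBSCAN labels as noise when run on the whole series."""
    labels = dbscan_1d(values, eps, min_pts)
    return [i for i, lab in enumerate(labels) if lab == NOISE]

def per_month_noise_indices(months, values, eps, min_pts):
    """ASSUMPTION (not the paper's algorithm): run DBSCAN once per calendar month."""
    groups = defaultdict(list)
    for idx, month in enumerate(months):
        groups[month].append(idx)
    found = []
    for idxs in groups.values():
        labels = dbscan_1d([values[i] for i in idxs], eps, min_pts)
        found.extend(i for i, lab in zip(idxs, labels) if lab == NOISE)
    return sorted(found)

# Synthetic monthly temperatures (made-up data): six years, year-major order.
MONTH_MEANS = [0, 2, 6, 10, 15, 19, 22, 21, 17, 11, 6, 2]
JITTER = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.0]  # one deterministic offset per year
months = [m for _ in JITTER for m in range(1, 13)]
values = [MONTH_MEANS[m - 1] + j for j in JITTER for m in range(1, 13)]
values[2 * 12 + 0] = 10.0  # local anomaly: a January reading typical of April
values[4 * 12 + 6] = 45.0  # global anomaly: far outside the overall range

global_anoms = noise_indices(values, eps=3.0, min_pts=3)
seasonal_anoms = per_month_noise_indices(months, values, eps=3.0, min_pts=3)
print(global_anoms)    # [54]
print(seasonal_anoms)  # [24, 54]
```

On this toy series, the whole-series run flags only the extreme July value, because the unusual January reading sits inside the range covered by other months. Grouping by month exposes it as well, which is the kind of local anomaly the abstract refers to.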
The International Arab Journal of Information Technology, Vol. 19, No. 1, January 2022