The International Arab Journal of Information Technology (IAJIT)


AFTM-Agent Based Fault Tolerance Manager in Cloud Environment

As the number of cloud users increases over time, so does the probability of failure in any cloud virtual machine. Failures can occur at any point during service delivery, and numerous techniques exist for reacting to them proactively. In the proposed framework, a service provider is allocated to the user on the basis of the provider's ranking. The ranking considers parameters such as trust values (calculated by a feedback mechanism), checkpointing overheads, availability, and throughput. Checkpoints trigger save points so that minimal data is lost if a failure occurs. This paper also compares the proposed framework with the Optimal Checkpoints Interval (OCI) framework, which triggers checkpoints at a constant rate. The results indicate that the Agent based Fault Tolerance Manager (AFTM) achieves 33% to 50% better efficiency than the OCI framework. The results presented in the paper demonstrate how much better checkpointing overheads, availability, and throughput are handled under the AFTM framework. In addition, the overheads were reduced to 50% as compared to the OCI framework.
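The ranking idea described above can be illustrated with a minimal sketch. It is not the AFTM algorithm itself: the abstract does not state how the four parameters are combined, so the min-max normalisation, the equal weights, and all names (`Provider`, `normalise`, `rank_providers`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    trust: float                # feedback-derived trust value (higher is better)
    checkpoint_overhead: float  # checkpointing cost (lower is better)
    availability: float         # fraction of time available (higher is better)
    throughput: float           # tasks completed per unit time (higher is better)


def normalise(values, higher_is_better=True):
    """Min-max scale values to [0, 1]; invert when lower values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_providers(providers, weights=(0.25, 0.25, 0.25, 0.25)):
    """Return providers sorted best-first by a weighted score.

    The equal weights are an assumption; the source gives no weighting.
    """
    w_trust, w_over, w_avail, w_thr = weights
    trust = normalise([p.trust for p in providers])
    over = normalise([p.checkpoint_overhead for p in providers],
                     higher_is_better=False)
    avail = normalise([p.availability for p in providers])
    thr = normalise([p.throughput for p in providers])
    scores = {
        p.name: w_trust * t + w_over * o + w_avail * a + w_thr * h
        for p, t, o, a, h in zip(providers, trust, over, avail, thr)
    }
    return sorted(providers, key=lambda p: scores[p.name], reverse=True)


# Hypothetical providers with made-up values, for demonstration only.
candidates = [
    Provider("B", trust=0.5, checkpoint_overhead=5.0, availability=0.90, throughput=50.0),
    Provider("A", trust=0.9, checkpoint_overhead=2.0, availability=0.99, throughput=100.0),
    Provider("C", trust=0.7, checkpoint_overhead=3.0, availability=0.95, throughput=80.0),
]
ranked = rank_providers(candidates)
print([p.name for p in ranked])
```

Under this scheme the user would be allocated the first provider in `ranked`. Changing the weights shifts the balance between trust and the performance parameters.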

The International Arab Journal of Information Technology, Vol. 19, No. 3, May 2022