Reliability-Aware Task Scheduling in Cloud Computing Using Multi-Agent Reinforcement Learning

The International Arab Journal of Information Technology, Vol. 18, No. 1, January 2021
Cloud computing has become the basic alternative platform for most user applications in recent years. The increasing complexity of the cloud environment, driven by the continuous growth of resources and applications, calls for an integrated fault-tolerance approach to guarantee quality of service. Focusing on reliability enhancement in a dynamically changing environment such as the cloud, we developed a multi-agent scheduler that uses a Reinforcement Learning (RL) algorithm and Neural Fitted Q iteration (NFQ) to schedule user requests effectively. Our approach accounts for the queue buffer size of each resource by applying queueing theory to design a queue model in which each scheduler agent has its own queue, fed with user requests from a global queue. A central learning agent is responsible for learning from the output of the scheduler agents and for directing them through the feedback obtained in the previous step. The dynamicity of the cloud environment is handled by a neural network that supports the reinforcement learning algorithm through a specified function. The numerical results demonstrate the efficiency of the proposed approach and its enhancement of reliability.
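The control loop described above (a global queue feeding per-resource queues, with a central agent learning from scheduling feedback) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the class and method names are ours, and a simple tabular Q-update stands in for the Neural Fitted Q network used in the actual system.

```python
import random
from collections import deque

class CentralScheduler:
    """Hypothetical sketch: a central agent routes tasks from a global
    queue into bounded per-resource queues, preferring the resource with
    the highest learned Q-value (a proxy for expected reliability)."""

    def __init__(self, n_resources, queue_cap=5, alpha=0.1, gamma=0.9):
        # One bounded queue per scheduler agent, as in the queue model.
        self.queues = [deque(maxlen=queue_cap) for _ in range(n_resources)]
        self.q = [0.0] * n_resources   # one Q-value per resource (state collapsed)
        self.alpha, self.gamma = alpha, gamma

    def dispatch(self, task, epsilon=0.1):
        """Route one task from the global queue; returns the chosen
        resource index, or None if every buffer is full."""
        free = [i for i, q in enumerate(self.queues) if len(q) < q.maxlen]
        if not free:
            return None                                   # all buffers full
        if random.random() < epsilon:
            choice = random.choice(free)                  # explore
        else:
            choice = max(free, key=lambda i: self.q[i])   # exploit
        self.queues[choice].append(task)
        return choice

    def feedback(self, resource, reward):
        """Central agent updates its estimate from the outcome reported
        by a scheduler agent (standard one-step Q-learning update)."""
        self.q[resource] += self.alpha * (
            reward + self.gamma * max(self.q) - self.q[resource])
```

In the paper's approach, the `feedback` update would instead retrain the NFQ neural network on a batch of collected transitions, which is what lets the scheduler adapt to the cloud environment's dynamics.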
Husamelddin Balla received his MSc in Computer Science from Harbin Institute of Technology. He is a research scholar at Northeast Forestry University. His research interests include cloud computing, machine learning, and natural language processing.

Chen Sheng is currently a Doctoral Supervisor and a Professor with Northeast Forestry University, China. He is also a member of the National Innovation Methods Research Institute and the Executive Director of the Education Information Technology Council of the Education Ministry. He has published over 30 academic papers and one monograph. His research interests include biomass material prediction, intelligent detection of new composite materials, and big data on forestry.

Jing Weipeng received the Ph.D. degree from the Harbin Institute of Technology, China. He is currently an Associate Professor with Northeast Forestry University, China. He has published over 50 research articles in refereed journals and conference proceedings, such as CPC, PUC, and FGCS. His research interests include modeling and scheduling for distributed computing systems, fault-tolerant computing and system reliability, cloud computing, and spatial data mining.
