Agent Enhancement using Deep Reinforcement Learning Algorithms for Multiplayer game (Slither.io)
Abstract
Developing a self-learning model for a game is challenging because the environment changes constantly, which requires models that can make decisions in real time based on the current state of the environment. The agent must learn the environment and take actions based on what it infers. Each action earns the agent a positive or negative reward, and the agent learns from that reward to improve its behaviour in the environment. This work aims to train an agent using deep reinforcement learning algorithms to play a multiplayer online game, Slither.io. We use an OpenAI Universe environment to collect raw image inputs from sample gameplay as training data. The agent learns the current state of the environment and the positions of the other players (snakes), then acts by choosing its direction of movement. To compare our model with existing systems and a random policy, we propose to use deep Q-learning and actor-critic approaches such as Proximal Policy Optimisation (PPO), with reward shaping and a replay buffer. Of these algorithms, the PPO agent shows a significant improvement in score over a range of episodes: it learns quickly, and its reward progression is higher than that of the other techniques.
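The three ingredients named in the abstract (a replay buffer, reward shaping, and the PPO objective) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names, the shaping terms (`death_penalty`, `step_bonus`), and the default values are all hypothetical, chosen only to make the ideas concrete. The clipped surrogate follows the standard PPO form, min(r·A, clip(r, 1−ε, 1+ε)·A), where r is the probability ratio between the new and old policies and A is the advantage estimate.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling reduces correlation between consecutive frames.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def shaped_reward(length_gain, died, death_penalty=-10.0, step_bonus=0.01):
    """Hypothetical shaping: reward growth, penalise death, small survival bonus."""
    if died:
        return death_penalty
    return float(length_gain) + step_bonus


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate for a single sample (to be maximised)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum removes any incentive to push the ratio far from 1.
    return min(ratio * advantage, clipped_ratio * advantage)
```

In a training loop, each game step would push one transition with its shaped reward into the buffer; a value-based learner such as deep Q-learning would then draw minibatches from it, while a PPO update would average the clipped objective over the collected samples.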
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.