- Original Paper
- Open access
Learning to cooperate in the Iterated Prisoner’s Dilemma by means of social attachments
Journal of the Brazilian Computer Society volume 17, pages 163–174 (2011)
Abstract
The Iterated Prisoner’s Dilemma (IPD) has been used as a paradigm for studying the emergence of cooperation among individual agents. Many computer experiments show that cooperation does arise under certain conditions. In particular, the spatial version of the IPD has been used to understand the role of local interactions in the emergence and maintenance of cooperation. It is known that individual learning leads players to the Nash equilibrium of the game, which means that cooperation is not selected. In this paper we therefore propose that, when players have social attachments, learning may lead to a certain rate of cooperation. We perform experiments in which agents play the spatial IPD while taking into account social relationships such as belonging to a hierarchy or to a coalition. Results show that learners end up cooperating, especially when coalitions emerge.
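As a small illustration of why purely individual learning is expected to settle on defection, the following sketch enumerates the pure-strategy Nash equilibria of a single Prisoner’s Dilemma stage game. The payoff values (T=5, R=3, P=1, S=0) are the conventional textbook ones and are an assumption here, not taken from the paper; the function names are likewise hypothetical.

```python
from itertools import product

# Conventional illustrative payoffs (an assumption, not taken from the paper):
# T (temptation) > R (reward) > P (punishment) > S (sucker's payoff).
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_action, other_action)] -> my payoff; "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
ACTIONS = ("C", "D")


def best_responses(other_action):
    """Return the actions that maximize my payoff against a fixed opponent action."""
    best = max(PAYOFF[(a, other_action)] for a in ACTIONS)
    return {a for a in ACTIONS if PAYOFF[(a, other_action)] == best}


def pure_nash_equilibria():
    """Return the action profiles in which each player best-responds to the other."""
    return [
        (a, b)
        for a, b in product(ACTIONS, ACTIONS)
        if a in best_responses(b) and b in best_responses(a)
    ]


if __name__ == "__main__":
    print(pure_nash_equilibria())  # [('D', 'D')]
```

Under these assumed payoffs, mutual defection is the only equilibrium even though mutual cooperation pays each player more (3 versus 1), which is the tension the abstract refers to.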
Additional information
A previous version of this paper appeared at BWSS 2010, the Brazilian Symposium on Social Simulation.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Bazzan, A.L.C., Peleteiro, A. & Burguillo, J.C. Learning to cooperate in the Iterated Prisoner’s Dilemma by means of social attachments. J Braz Comput Soc 17, 163–174 (2011). https://doi.org/10.1007/s13173-011-0038-2
DOI: https://doi.org/10.1007/s13173-011-0038-2