Multicoordinated agreement for groups of agents

Camargos, Lasaro; Schmidt, Rodrigo; Madeira, Edmundo; Pedone, Fernando

doi:10.1007/s13173-010-0001-7

Original Paper
Open access
Published: 23 April 2010

Journal of the Brazilian Computer Society, volume 16, pages 49–68 (2010)

Abstract

Agents in agreement protocols play distinct roles. Proposers propose values to the acceptors, which accept proposals and inform the learners so that they can detect that agreement has been reached. A fourth role is that of the coordinator, which filters the proposals sent from proposers to acceptors. While proposers, learners, and coordinators are easily replaced, replacing an acceptor is prohibitively expensive. Protocols that do not employ a coordinator are less resilient to acceptor failures. Protocols that use one coordinator are more resilient to acceptor failures, at the expense of one extra communication step even in the absence of failures. Moreover, they require replacing the coordinator as soon as it fails, a reconfiguration that, although relatively inexpensive, reduces the protocol's availability. Hence, either option, zero or one coordinator, has its drawbacks. In previous work, we presented an alternative: multicoordinated agreement protocols. Such protocols are as resilient as single-coordinated protocols but require less reconfiguration to cope with coordinator failures. In fact, most reconfiguration can be done in parallel with the execution of the protocol's normal steps. Multicoordination can be applied to several problems. In this paper, we exemplify its use in solving consensus and then introduce a fast multicoordinated agreement protocol for agents organized in groups, an abstraction for fast local area networks interconnected by slower links.
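
The roles described in the abstract can be illustrated with a small Python sketch of a single failure-free round. It is not the paper's protocol: it assumes simple majority quorums, and it assumes that an acceptor accepts a value only once a majority of coordinators has forwarded that same value. Neither detail is stated in the abstract. The names `Acceptor`, `Learner`, `run_round`, and `crashed_coordinators` are hypothetical, and rounds, recovery, and conflicting proposals are left out entirely.

```python
from collections import Counter


def majority(n):
    """Smallest number of members that forms a majority of n."""
    return n // 2 + 1


class Acceptor:
    """Accepts a value once a majority of coordinators forwarded it (assumed rule)."""

    def __init__(self, num_coordinators):
        self.needed = majority(num_coordinators)
        self.forwards = {}    # coordinator id -> value it forwarded
        self.accepted = None

    def on_forward(self, coord_id, value):
        self.forwards[coord_id] = value
        if self.accepted is None:
            top_value, count = Counter(self.forwards.values()).most_common(1)[0]
            if count >= self.needed:
                self.accepted = top_value
        return self.accepted


class Learner:
    """Learns a value once a majority of acceptors reported accepting it."""

    def __init__(self, num_acceptors):
        self.needed = majority(num_acceptors)
        self.reports = {}     # acceptor id -> value it accepted
        self.learned = None

    def on_accepted(self, acceptor_id, value):
        self.reports[acceptor_id] = value
        if self.learned is None:
            top_value, count = Counter(self.reports.values()).most_common(1)[0]
            if count >= self.needed:
                self.learned = top_value
        return self.learned


def run_round(value, num_coordinators=3, num_acceptors=3, crashed_coordinators=()):
    """A proposer sends `value` to every coordinator; live coordinators forward it."""
    acceptors = [Acceptor(num_coordinators) for _ in range(num_acceptors)]
    learner = Learner(num_acceptors)
    for coord_id in range(num_coordinators):
        if coord_id in crashed_coordinators:
            continue  # a crashed coordinator forwards nothing
        for acceptor_id, acceptor in enumerate(acceptors):
            accepted = acceptor.on_forward(coord_id, value)
            if accepted is not None:
                learner.on_accepted(acceptor_id, accepted)
    return learner.learned
```

Under these assumptions, `run_round("v", crashed_coordinators={0})` still lets the learner learn `"v"` with three coordinators, whereas `run_round("v", num_coordinators=1, crashed_coordinators={0})` learns nothing until the single coordinator is replaced. This is the availability trade-off the abstract describes.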

Author information

Authors and Affiliations

IC, Institute of Computing UNICAMP, University of Campinas, PO Box 6176, 13083-970, Campinas, SP, Brazil
Lasaro Camargos & Edmundo Madeira
Facebook, 1601 S California Avenue, Palo Alto, CA, 94304, USA
Rodrigo Schmidt
Faculty of Informatics, University of Lugano (USI), Via Giuseppe Buffi 6, 6904, Lugano, Switzerland
Fernando Pedone

Corresponding author

Correspondence to Lasaro Camargos.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Camargos, L., Schmidt, R., Madeira, E. et al. Multicoordinated agreement for groups of agents. J Braz Comput Soc 16, 49–68 (2010). https://doi.org/10.1007/s13173-010-0001-7

Received: 30 July 2009
Accepted: 01 March 2010
Published: 23 April 2010
Issue Date: May 2010
DOI: https://doi.org/10.1007/s13173-010-0001-7
