Design and Analysis of Distributed Algorithms



Table of Contents


1. DISTRIBUTED COMPUTING ENVIRONMENTS.

   1.1 Entities.

   1.2 Communication.

   1.3 Axioms and Restrictions.

        1.3.1 Axioms.

        1.3.2 Restrictions.

   1.4 Cost and Complexity.

        1.4.1 Amount of Communication Activities.

        1.4.2 Time.

   1.5 An Example: Broadcasting.

   1.6 States and Events.

        1.6.1 Time and Events.

        1.6.2 States and Configurations.

   1.7 Problems and Solutions.

   1.8 Knowledge.

        1.8.1 Levels of Knowledge.

        1.8.2 Types of Knowledge.

   1.9 Technical Considerations.

        1.9.1 Messages.

        1.9.2 Protocol.

        1.9.3 Communication Mechanism.

   1.10 Summary of Definitions.

   1.11 Bibliographical Notes.

   1.12 Exercises, Problems, and Answers.

2. BASIC PROBLEMS AND PROTOCOLS.

   2.1 Broadcast.

        2.1.1 The Problem.

        2.1.2 Cost of Broadcasting.

        2.1.3 Broadcasting in Special Networks.

   2.2 Wake-Up.

        2.2.1 Generic Wake-Up.

        2.2.2 Wake-Up in Special Networks.

   2.3 Traversal.

        2.3.1 Depth-First Traversal.

        2.3.2 Hacking.

        2.3.3 Traversal in Special Networks.

        2.3.4 Considerations on Traversal.

   2.4 Practical Implications: Use a Subnet.

   2.5 Constructing a Spanning Tree.

        2.5.1 SPT Construction with a Single Initiator: Shout.

        2.5.2 Other SPT Constructions with Single Initiator.

        2.5.3 Considerations on the Constructed Tree.

        2.5.4 Application: Better Traversal.

        2.5.5 Spanning-Tree Construction with Multiple Initiators.

        2.5.6 Impossibility Result.

        2.5.7 SPT with Initial Distinct Values.

   2.6 Computations in Trees.

        2.6.1 Saturation: A Basic Technique.

        2.6.2 Minimum Finding.

        2.6.3 Distributed Function Evaluation.

        2.6.4 Finding Eccentricities.

        2.6.5 Center Finding.

        2.6.6 Other Computations.

        2.6.7 Computing in Rooted Trees.

   2.7 Summary.

   2.8 Bibliographical Notes.

   2.9 Exercises, Problems, and Answers.

3. ELECTION.

   3.1 Introduction.

        3.1.1 Impossibility Result.

        3.1.2 Additional Restrictions.

        3.1.3 Solution Strategies.

   3.2 Election in Trees.

   3.3 Election in Rings.

        3.3.1 All the Way.

        3.3.2 As Far As It Can.

        3.3.3 Controlled Distance.

        3.3.4 Electoral Stages.

        3.3.5 Stages with Feedback.

        3.3.6 Alternating Steps.

        3.3.7 Unidirectional Protocols.

        3.3.8 Limits to Improvements.

        3.3.9 Summary and Lessons.

   3.4 Election in Mesh Networks.

        3.4.1 Meshes.

        3.4.2 Tori.

   3.5 Election in Cube Networks.

        3.5.1 Oriented Hypercubes.

        3.5.2 Unoriented Hypercubes.

   3.6 Election in Complete Networks.

        3.6.1 Stages and Territory.

        3.6.2 Surprising Limitation.

        3.6.3 Harvesting the Communication Power.

   3.7 Election in Chordal Rings.

        3.7.1 Chordal Rings.

        3.7.2 Lower Bounds.

   3.8 Universal Election.

        3.8.1 Mega-Merger.

        3.8.2 Analysis of Mega-Merger.

        3.8.3 YO-YO.

        3.8.4 Lower Bounds and Equivalences.

   3.9 Bibliographical Notes.

   3.10 Exercises, Problems, and Answers.

4. MESSAGE ROUTING AND SHORTEST PATHS.

   4.1 Introduction.

   4.2 Shortest-Path Routing.

        4.2.1 Gossiping the Network Maps.

        4.2.2 Iterative Construction of Routing Tables.

        4.2.3 Constructing Shortest-Path Spanning Tree.

        4.2.4 Constructing a Sparser Subgraph.

        4.2.5 Min-Hop Routing.

        4.2.6 Suboptimal Solutions: Routing Trees.

   4.3 Coping with Changes.

        4.3.1 Adaptive Routing.

        4.3.2 Fault-Tolerant Tables.

        4.3.3 On Correctness and Guarantees.

   4.4 Routing in Static Systems: Compact Tables.

        4.4.1 The Size of Routing Tables.

        4.4.2 Interval Routing.

   4.5 Bibliographical Notes.

   4.6 Exercises, Problems, and Answers.

5. DISTRIBUTED SET OPERATIONS.

   5.1 Introduction.

   5.2 Distributed Selection.

        5.2.1 Order Statistics.

        5.2.2 Selection in a Small Data Set.

        5.2.3 Simple Case: Selection Among Two Sites.

        5.2.4 General Selection Strategy: RankSelect.

        5.2.5 Reducing the Worst Case: ReduceSelect.

   5.3 Sorting a Distributed Set.

        5.3.1 Distributed Sorting.

        5.3.2 Special Case: Sorting on an Ordered Line.

        5.3.3 Removing the Topological Constraints: Complete Graph.

        5.3.4 Basic Limitations.

        5.3.5 Efficient Sorting: SelectSort.

        5.3.6 Unrestricted Sorting.

   5.4 Distributed Set Operations.

        5.4.1 Operations on Distributed Sets.

        5.4.2 Local Structure.

        5.4.3 Local Evaluation.

        5.4.4 Global Evaluation.

        5.4.5 Operational Costs.

   5.5 Bibliographical Notes.

   5.6 Exercises, Problems, and Answers.

6. SYNCHRONOUS COMPUTATIONS.

   6.1 Synchronous Distributed Computing.

        6.1.1 Fully Synchronous Systems.

        6.1.2 Clocks and Unit of Time.

        6.1.3 Communication Delays and Size of Messages.

        6.1.4 On the Unique Nature of Synchronous Computations.

        6.1.5 The Cost of Synchronous Protocols.

   6.2 Communicators, Pipeline, and Transformers.

        6.2.1 Two-Party Communication.

        6.2.2 Pipeline.

        6.2.3 Transformers.

   6.3 Min-Finding and Election: Waiting and Guessing.

        6.3.1 Waiting.

        6.3.2 Guessing.

        6.3.3 Double Wait: Integrating Waiting and Guessing.

   6.4 Synchronization Problems: Reset, Unison, and Firing Squad.

        6.4.1 Reset/Wakeup.

        6.4.2 Unison.

        6.4.3 Firing Squad.

   6.5 Bibliographical Notes.

   6.6 Exercises, Problems, and Answers.

7. COMPUTING IN PRESENCE OF FAULTS.

   7.1 Introduction.

        7.1.1 Faults and Failures.

        7.1.2 Modelling Faults.

        7.1.3 Topological Factors.

        7.1.4 Fault-Tolerance, Agreement and Common Knowledge.

   7.2 The Crushing Impact of Failures.

        7.2.1 Node Failures: Single Fault Disaster.

        7.2.2 Consequences of the Single Fault Disaster.

   7.3 Localized Entity Failures: Using Synchrony.

        7.3.1 Synchronous Consensus with Crash Failures.

        7.3.2 Synchronous Consensus with Byzantine Failures.

        7.3.3 Limit to Number of Byzantine Entities for Agreement.

        7.3.4 From Boolean to General Byzantine Agreement.

        7.3.5 Byzantine Agreement in Arbitrary Graphs.

   7.4 Localized Entity Failures: Using Randomization.

        7.4.1 Random Actions and Coin Flips.

        7.4.2 Randomized Asynchronous Consensus: Crash Failures.

        7.4.3 Concluding Remarks.

   7.5 Localized Entity Failures: Using Fault Detection.

        7.5.1 Failure Detectors and Their Properties.

        7.5.2 The Weakest Failure Detector.

   7.6 Localized Entity Failures: Pre-Execution Failures.

        7.6.1 Partial Reliability.

        7.6.2 Example: Election in Complete Network.

   7.7 Localized Link Failures.

        7.7.1 A Tale of Two Synchronous Generals.

        7.7.2 Computing with Faulty Links.

        7.7.3 Concluding Remarks.

        7.7.4 Considerations on Localized Entity Failures.

   7.8 Ubiquitous Faults.

        7.8.1 Communication Faults and Agreement.

        7.8.2 Limits to Number of Ubiquitous Faults for Majority.

        7.8.3 Unanimity in Spite of Ubiquitous Faults.

        7.8.4 Tightness.

   7.9 Bibliographical Notes.

   7.10 Exercises, Problems, and Answers.

8. DETECTING STABLE PROPERTIES.

   8.1 Introduction.

   8.2 Deadlock Detection.

        8.2.1 Deadlock.

        8.2.2 Detecting Deadlock: Wait-for Graph.

        8.2.3 Single-Request Systems.

        8.2.4 Multiple-Requests Systems.

        8.2.5 Dynamic Wait-For Graphs.

        8.2.6 Other Requests Systems.

   8.3 Global Termination Detection.

        8.3.1 A Simple Solution: Repeated Termination Queries.

        8.3.2 Improved Protocols: Shrink.

        8.3.3 Concluding Remarks.

   8.4 Global Stable Property Detection.

        8.4.1 General Strategy.

        8.4.2 Time Cuts and Consistent Snapshots.

        8.4.3 Computing a Consistent Snapshot.

        8.4.4 Summary: Putting All Together.

   8.5 Bibliographical Notes.

   8.6 Exercises, Problems, and Answers.

9. CONTINUOUS COMPUTATIONS.

   9.1 Introduction.

   9.2 Keeping Virtual Time.

        9.2.1 Virtual Time and Causal Order.

        9.2.2 Causal Order: Counter Clocks.

        9.2.3 Complete Causal Order: Vector Clocks.

        9.2.4 Concluding Remarks.

   9.3 Distributed Mutual Exclusion.

        9.3.1 The Problem.

        9.3.2 A Simple and Efficient Solution.

        9.3.3 Traversing the Network.

        9.3.4 Managing a Distributed Queue.

        9.3.5 Decentralized Permissions.

        9.3.6 Mutual Exclusion in Complete Graphs: Quorum.

        9.3.7 Concluding Remarks.

   9.4 Deadlock: System Detection and Resolution.

        9.4.1 System Detection and Resolution.

        9.4.2 Detection and Resolution in Single-Request Systems.

        9.4.3 Detection and Resolution in Multiple-Requests Systems.

   9.5 Bibliographical Notes.

   9.6 Exercises, Problems, and Answers.