Fall 2024-CS 539-ECE 526-Distributed Algorithms

Distributed algorithms are algorithms designed to run on multiple processes or mutually distrusting participants, often to increase performance through parallel processing or improve resilience by tolerating faults. Without centralized control, distributed algorithms typically require delicate coordination among processes. The concurrency, non-determinism, and potential faults make distributed algorithms hard to design and analyze compared to single-process algorithms. 

This course covers foundations and recent advances in distributed algorithms. The course studies fundamental problems in distributed computing, including graph algorithms, leader election, synchronization, consensus, broadcast, state machine replication, distributed shared memory, mutual exclusion, shared objects, and distributed cryptography. The course studies foundational models and assumptions regarding shared memory vs. message passing, synchrony vs. asynchrony, reliability of message delivery, crashes and Byzantine faults. The course introduces common algorithm design techniques, including synchronization, cryptographic primitives, leader election, randomization, optimistic cases, simulation, and quorums. 

This is a theory/algorithm course that emphasizes formal modeling, rigorous proofs, theoretical analysis, and fundamental limits in the form of lower bounds and impossibility results. While the course also touches on practical aspects whenever applicable, it will not be a priority, and many topics covered in the course will be of theoretical interest primarily at present.

Prerequisite: None required. However, a course on algorithms (like CS/ECE 374) OR distributed systems (like CS 425 / ECE 428) is strongly recommended. 

Distributed Algorithms (CS539) vs. Distributed Systems (CS425): If you are wondering how to choose between these two classes or if you should take both, here is some information for you. CS425 covers basic concepts ranging from algorithmics to systems design principles that underlie distributed systems, especially those deployed in the industry. CS539 focuses on the theoretical underpinnings of distributed algorithms with an emphasis on their formal modeling, rigorous proofs, and fundamental limits. Neither course is a prerequisite for the other. The courses have a small overlap for the sake of continuity so that you can (if you wish) take both courses in either order and still benefit from each course. Take CS425 first if you want an intro course to distributed systems. Take CS539 first if you love doing algorithmics and proofs. Both courses are intended to be accessible to undergrads (with appropriate prerequisites) as well as grad students at all levels.

Grading: five to six problem sets 60%; midterm 15%, final exams 20%; class participation 5%.    

Resources: No required textbook. The following textbooks and online resources may be helpful.

Lectures: Monday and Wednesday 11 am -- 12:15 pm in 0222 Siebel Center for Computer Science.

Office Hours: Monday 10 - 11 am in 4312 Siebel Center AND after each class in the lecture room.

Tentative Schedule: (subject to change)

Date Topic Suggested Reading
08/26 Introduction
08/28 Leader Election in a Ring An Improved Algorithm for Decentralized Extrema-Finding in Circular Configurations of Processes 
09/02 No Class (Labor Day)
09/04 No Class (Instructor Away)
09/09 Basic Graph Algorithms
09/11 Clock Synchronization
09/16 Synchronizers Complexity of Network SynchronizationLinks to an external site (Sections 1-3)
09/18 Causality, Logical Clocks, and Global States Time, Clocks and Ordring of Events in a Distributed System
09/23 Common Knowledge Knowledge and common knowledge in a distributed environment
09/25 Introduction to Consensus The Byzantine Generals Problem (Sections 1-3)
09/30 Time Bounds of Consensus A Simple Bivalency Proof that t-Resilient Consensus Requires t+1 Rounds
10/02 FLP Impossibility Impossibility of Distributed Consensus with One Faulty Process    
10/07 Randomized Agreement Another advantage of free choice        Modern Ben-Or
10/09 Partial Synchrony and Paxos Partial synchrony     The Part-Time Parliament      Paxos Made Simple
10/14 No Class (extra office hour)  
10/16 Midterm  
10/21 Distributed Shared Memory Sharing memory robustly in message-passing systems (Sections 1-4)
10/23 Shared Registers  
11/28 Mutual Exclusion Peterson's Mutual Exclusion Algorithm
11/30 Mutual Exclusion
Lamport's Bakery Algorithm
11/04 Mutual Exclusion Mellor-Crummey and Scott Mutual Exclusion Algorithm
11/06 Wait-free Hierarchy Wait-free synchronization
11/11 Byzantine Broadcast Dolev-Strong Authenticated Broadcast   
11/13 Reliable Broadcast Bracha's Reliable Broadcast
11/18 PBFT Practical Byzantine Fault Tolerance (Sections 1-4)
11/20 Fault Tolerance Bounds
11/25 No Class (Fall Break)
11/27 No Class (Fall Break)
12/02 Nakamoto Consensus Bitcoin: A Peer-to-Peer Electronic Cash System
12/04 Nakamoto Consensus Bitcoin's Latency--Security Analysis Made Simple
12/09 No Class (extra office hour)
12/11 Final Exam