Course Syllabus

Distributed algorithms are algorithms designed to run on multiple processes or mutually distrusting participants, often to increase performance through parallel processing or improve resilience by tolerating faults. Despite not having centralized control, distributed algorithms typically require delicate coordination among processes. The concurrency, non-determinism, and potential faults make distributed algorithms hard to design and analyze compared to single-process algorithms. 

This course covers foundations and recent advances in distributed algorithms. The course studies important problems in distributed computing, including graph algorithms, synchronization, consensus, broadcast, state machine replication, distributed shared memory, mutual exclusion, shared objects, and threshold cryptography. The course studies foundational models and assumptions regarding shared memory vs. message passing, synchrony vs. asynchrony, reliability of message delivery, crash and Byzantine faults. The course introduces common algorithm design techniques, including synchronization, cryptographic primitives, leader election, randomization, optimistic case, simulation, and quorums. 

This is a theory/algorithm course that emphasizes formal modeling, rigorous proofs, theoretical analysis, and fundamental limits in the form of lower bounds and impossibility results. While the course also touches on practical aspects whenever applicable, it will not be a priority of the course, and many topics covered in the course will be of theoretical interest primarily at present.

Prerequisite: None required. But a course on algorithms (like CS/ECE 374 or CS 473) OR distributed systems (like CS 425 / ECE 428) is strongly recommended. 

Distributed Algorithms (CS539) vs. Distributed Systems (CS425): If you are wondering how to choose between these two classes or if you should take both, here is some information for you. CS425 covers basic concepts ranging from algorithmics to systems design principles that underlie distributed systems, especially those deployed in the industry. CS539 focuses on the theoretical underpinnings of distributed algorithms with an emphasis on their formal modeling, rigorous proofs, and fundamental limits. Neither course is a prerequisite for the other. The courses have a small overlap for the sake of continuity so that you can (if you wish) take both courses in either order and still benefit from each course. Take CS425 first if you want an intro course to distributed systems. Take CS539 first if you love doing algorithmics and proofs. Both courses are intended to be accessible to undergrads (with appropriate prerequisites) as well as grad students at all levels.

Grading: five to six problem sets 60%; midterm 15%, final exams 20%; class participation 5%.    

Resources: No required textbook. The following textbooks and online resources may be helpful.

Lectures: Monday and Wednesday 9:30 -- 10:45 am in 114 Transportation Building.

Lecture recordings will be available shortly after each class on our course channel on Mediaspace. But in-person attendance to lectures is recommended. 

Office Hours: Monday 2 - 3 pm in 4312 Siebel Center AND after each class in the lecture room.

Tentative Schedule: (subject to change)

Date Topic Suggested Reading
01/18 Introduction and Models
01/23 Basic Graph Algorithms
01/25 Clock Synchronization
01/30 Synchronizers Complexity of Network Synchronization (Sections 1-3)
02/01 Causality and Logical Clocks Time, Clocks and Ordring of Events in a Distributed System
02/06 Introduction to Consensus The Byzantine Generals Problem (Sections 1-3)
02/08 Dolev-Strong Broadcast Dolev-Strong Authenticated Broadcast
02/13 Complexity Bounds of Consensus A Simple Bivalency Proof that t-Resilient Consensus Requires t+1 Rounds
02/15 Fault Bounds in Synchrony
02/20 Fault Bounds in Asynchrony Impossibility of Distributed Consensus with One Faulty Process
02/22 Asynchronous Solvability Bracha's Reliable Broadcast
02/27 Randomized Agreement Another advantage of free choice    Modern version of Ben-Or
03/01 Paxos The Part-Time Parliament    Paxos Made Simple
03/06 Midterm review
03/08 Midterm
03/13 Spring Break
03/15 Spring Break
03/20 PBFT Practical Byzantine Fault Tolerance (Sections 1-4)
03/22 Distributed Shared Memory Sharing memory robustly in message-passing systems (Sections 1-4)
03/27 Shared Registers
03/29 Mutual Exclusion Peterson Lock    
04/03 Mutual Exclusion Bakery Algorithm
04/05 Wait-free Hierarchy Wait-free synchronization
04/10 Cancelled
04/12 Secret Sharing and Error Correction How to share a secret
04/17 Threshold Cryptography Fast and Scalable BLS Threshold Signatures
04/19 Multi-party Computation
04/24 Nakamoto Consensus Bitcoin: A Peer-to-Peer Electronic Cash System
04/26 Nakamoto Consensus Bitcoin's Latency--Security Analysis Made Simple
05/01 Final Review
05/03 Final