These realities are characterized by the presence of
of networked entities communicating with each other,
cooperating towards common tasks or the solution of
a shared problem, acting autonomously and spontaneously.
They are distributed computing environments.
It has been from the fields of network and of communication engineering that the seeds of what we now experience have germinated. The growth in understanding has occurred when computer scientists (initially very few) started to become aware of and study the computational issues connected with these new network-centric realities. The internet, the web, the grids are just examples of these environments. Whether over wired or wireless media, whether by static or nomadic code, computing in such environments is inherently decentralized and distributed. (Incredibly, the terms ``distributed systems" and ``distributed computing" have been for years highjacked and (ab)used to describe very limited systems and low-level solutions (e.g. client-server) that have little to do with distributed computing.) To compute in distributed environments one must understand the basic principles, the fundamental properties, the available tools, the inherent limitations.
This universe consists of a finite collection of computational entities communicating by means of messages in order to achieve a common goal; e.g., to perform a given task, to compute the solution to a problem, to satisfy a request either from the user (i.e., outside the environment) or from other entities. Although each entity is capable of performing computations, it is the collection of all these entities that together will solve the problem or ensure that the task is performed. In this universe, to solve a problem, we must discover and design a distributed algorithm or protocol for those entities: a set of rules that specify what each entity has to do. The collective but autonomous execution of those rules, possibly without any supervision or synchronization, must enable the entities to perform the desired task, to solve the problem. In the design process, we must ensure both correctness (i.e., the protocol we design indeed solves the problem) and efficiency (i.e., the protocol we design has a ``small" cost).
There are several levels of use of the book. The book is primarily a senior-undergraduate and graduate textbook; it contains the material for two one-term courses or alternatively a full-year course on Distributed Algorithms and Protocols, or Distributed Computing, or Network Computing, or Special Topics in Algorithms. It covers the ``distributed part" of a graduate course on Parallel and Distributed Computing (the Chapters on Distributed Data, Routing, and Synchronous Computing, in particular), and it is the theoretical companion book for a course in Distributed Systems, or Advanced Operating Systems or Distributed Data Processing.
The book is written for the students from the students' point of view, and it follows closely a well defined teaching path and method (the ``course") developed over the years; both the path and the method become apparent while reading and using the book. It also provides a self-contained self-directed guide for system-protocol designers, for communication software engineers and developers, as well as for researchers wanting to enter or just interested in the area; it enables hands-on head-on in-depth acquisition of the material. In addition, it is a source-book and reference-book for investigators in distributed computing and related areas.
All protocols in the textbook as well as those designed by the students as part of the exercises are immediately programmable. (An open source Java-based engine, DisJ, provides the execution and visualization environment for our reactive protocols.) Hence, the subtleties of actual implementation can be employed to enhance understanding of the theoretical design principles; furthermore, experimental analysis (e.g., performance evaluation and comparison) can be easily and usefully integrated in the coursework expanding the analytical tools.
The book is written so to require no prerequisites other than standard undergraduate knowledge of operating systems and of algorithms. Clearly, concurrent or prior knowledge of communication networks, or distributed operating systems, or distributed transaction systems, would help the reader to ground the material of this course into some practical application context; however, none is necessary.
The prerequisite structure reccomends the first three chapters to be covered sequentially and before the other material. There are only two other prerequisite relationships. The relationship between Synchronous Computing (Chapter 6) and Fault-Tolerant Computing (Chapter 7) is particular: The recommended sequencing is in fact the following: Sections 7.1-7.2 (providing the strong motivation for synchronous computing), Chapter 6 (describing fault-free synchronous computing), the rest of Chapter 7 (dealing with fault-tolerant synchronous computing as well as other issues). The other suggested prerequisite structure is that the topic of Stable Properties (Chapter 8) be handled before that of Continuous Computations (Chapter 9). Other than that, the sections can be mixed and matched depending on the instructor's preferences and interests. An interesting and popular sequence for a one-semester course is given by Chapters 1-6. A more conventional one-semester sequence is provided by Chapters 1-3, 6-9.
The symbol * after a Section indicates non-core material. In connection with Exercises and Problems the symbol * denotes instead difficulty (the more symbols, the greater the difficulty).
Several important topics are not included in this edition of the book. In particular, the book does not include right now algorithms on distributed coloring, on minimal independent sets, on self-stabilization, as well as on Sense of Direction. By design, this book does not include distributed computing in the shared memory model, focusing entirely on the message-passing paradigm.
distributed algorithms are fun.
I welcome you to share this experience and I hope you will reach the same conclusion.