SRE Classroom: Distributed PubSub is a workshop developed by Google’s Site Reliability Engineering group. The goals of this workshop are to (1) introduce participants to the principles of non-abstract large systems design (NALSD), and (2) provide hands-on experience applying these principles to the design and evaluation of such systems. We consider NALSD a concept fundamental to SRE, and understanding its principles provides a basis for having meaningful conversations about the design and operation of large software systems.
In the first, theoretical part of the workshop, participants learn about some foundational large system design principles and concepts. Topics include correctness, reliability, performance, different inter-system communication styles, and more. We introduce the problem requirements in detail and walk through the first parts of an example solution.
The practical part of this workshop asks participants to apply the principles they have learned to develop a Publish-Subscribe system that meets specified performance and correctness requirements, as well as Service Level Objectives (SLOs).
The workshop concludes with a detailed example solution, as well as a discussion of the system’s inputs and SLOs.
This workshop includes technical content, and its primary audience is software developers and site reliability engineers. We have also welcomed folks in various other roles, including product management and senior engineering management, to this workshop.
The workshop includes hands-on work well-suited for groups of five, and scales well from 1 to 20 groups, or as many as a hundred participants!
We aim to develop durable SRE Classroom materials for folks learning about NALSD. If you find this useful, tell us what you want to see in future exercises. Please use the issue tracker to send us your thoughts and suggestions. Alternatively, send us a tweet at @googlesre.
The workshop documents above are released under the Creative Commons CC-BY-4.0 license for anyone to use and reuse, as long as Google is credited as the original author. If you want to suggest improvements, have any problems with the content, or just want to ask a question, please create a bug in our issue tracker component.