SRE Classroom: Learning about NALSD and SRE

Introduction

SRE Classroom is a collection of workshops developed by Google's Site Reliability Engineering group. The goals of these workshops are to (1) introduce participants to the principles of non-abstract large systems design (NALSD), and (2) provide hands-on experience in applying these principles to the design and evaluation of large systems. We consider NALSD fundamental to SRE, and understanding its principles provides a basis for meaningful conversations about the design and operation of large software systems.

Tutorials

distributed-pubsub

Build a planet-scale distributed PubSub system using NALSD principles. Learn about foundational principles and concepts of large system design. Topics include correctness, reliability, performance, different inter-system communication styles, and more. We introduce the problem requirements in detail and walk through an example solution.

imageserver

Build a planet-scale distributed ImageServer system using NALSD principles. Learn about foundational principles and concepts of large system design. Topics include sharding, replication, latency, load balancing, and more. We introduce the problem requirements in detail and walk through an example solution.

Additional Resources

This section collects material that you can use to continue your study of NALSD. You can use this material independently of the tutorials.

If you find these materials useful, tell us which topics you want to see in future exercises. Please use the issue tracker to send us your thoughts and suggestions. Alternatively, send us a tweet at @googlesre.

Licensing

These materials are released under the Creative Commons CC-BY-4.0 license for anyone to use and reuse, as long as Google is credited as the original author. If you want to suggest improvements, have any problems with the content, or just want to ask a question, please create a bug in our issue tracker component.