The SRE resource library

SRE is a mindset, a set of engineering practices, and a job function. Here you will find articles, videos, and guides to help you implement SRE principles and run reliable production systems.

Explore All Resources

Machine learning

Start your journey by exploring

Machine Learning in Production

Continue your journey by reading

Efficient Machine Learning Inference

Extend your journey by watching

Machine Learning at Scale

Service level objectives

Begin by reading

Implementing SLOs

Dig deeper by exploring

Alerting on SLOs

Build your skills with

Art of SLOs

Systems engineering

Learn the basics by reading

Introducing Non-Abstract Large System Design

Develop fundamentals by exploring

SRE Classroom: Distributed ImageServer

Build advanced skills with this video workshop

How to Design a Distributed System
