- Danny Mican
I'm excited to announce Site Reliability Engineering Tidbits!
This book is a collection of 28 chapters on #sre concepts, such as observability, monitoring, Service Level Objectives (#slos), alerting, resilience and debugging.
Most of the chapters detail concepts that I applied in professional settings, so they have been proven in real businesses!
This book aims to provide hands-on examples of implementing a number of concepts described in Google's SRE books. It also describes how I've seen SRE concepts impact some of the organizations I've worked in.
A couple of chapters are hands-on debugging exercises that walk through debugging applications using data and the scientific method. I'm excited because every chapter describes something that I've done in paying jobs, so the book documents real-life SRE in action at organizations of various sizes rather than theoretical SRE concepts.
I adapted each chapter from blog posts I've written over the last 4 years, and I would be happy to share any information about the process or content.