
Marcel Koert

Latest release

Essential SRE: It's Not the Tools, It's Us

Reliability does not fail only because of bad architecture, broken pipelines, weak monitoring, or missing automation. Sometimes reliability fails because humans are tired, biased, afraid, overloaded, overconfident, badly incentivised, or simply trying to survive another incident call. In Essential SRE: It's Not the Tools, It's Us, Marcel Koert takes an opinionated, practical, and deeply human look at Site Reliability Engineering.
This is not another book pretending that Kubernetes, observability dashboards, SLOs, incident tooling, or platform engineering will magically fix unreliable organisations. They will not. Tools matter, but people decide how those tools are used, ignored, misunderstood, bypassed, defended, or blamed.

This book explores the uncomfortable human side of reliability: cognitive bias, psychological safety, burnout, alert fatigue, hero culture, handoffs, sleep deprivation, communication under pressure, rule-bending, incentives, stress, and the dangerous myth of "human error." You will read about why incident rooms anchor too quickly on the first metric they see, why senior engineers can become overconfident failure amplifiers, why alert fatigue makes smart people operationally deaf, why hero culture creates chronic fragility, and why a tired engineer with production access can become a bigger reliability risk than a missing readiness probe.

Written for SREs, DevOps engineers, platform engineers, engineering managers, incident commanders, and technology leaders, this book challenges the idea that reliability is mostly a tooling problem. It is not. Reliability is a socio-technical system.
The technical part matters. The human part decides whether it survives contact with production.

Inside this book, you will explore:

- Why humans are often the original single point of failure
- How cognitive biases distort incident response
- Why psychological safety is an uptime feature
- How sleep, stress, burnout, and alert fatigue damage reliability
- Why handoffs, meetings, Slack, and time zones create hidden operational risk
- How incentives shape outages long before the pager goes off
- Why hero culture feels useful but creates long-term fragility
- How to move from blaming "human error" to designing for human performance

This is an opinionated book.
You may agree with it. You may disagree with it. You may even get annoyed by parts of it. Good. Reliability deserves better conversations than tool comparisons and maturity models. If this book makes you rethink how your organisation treats people during incidents, designs operational work, rewards heroics, or talks about failure, then it has done its job. Because in SRE, it is not only the tools.
It is us. 
