ABAW Challenge #2 – Site Reliability Engineering: How Google Runs Production Systems


It is the second week of ABAW Challenge and the book I picked for this week was Site Reliability Engineering: How Google Runs Production Systems.

I would encourage anyone working in DevOps, server administration, database administration, system/software architecture or similar roles to read this book.

Probably nobody knows how to run a production system better than the Google team. I live in a country where most of us type http://www.google.com into the web browser to check whether the internet is working. We trust the availability of the google.com website more than the availability of our own internet connection. Even if http://www.google.com is down, we still assume it is an internet problem, because we trust Google to be always up and running.

This book is written by the engineers (dozens of them, in fact) who run Google's production systems: the team responsible for the availability and performance of Google products. Within Google, this team is called SRE (Site Reliability Engineering).

The main attraction of buying this book and spending a week reading it was that it is written by the engineers who run my favorite Google products. Reading it was my way of listening to them and understanding their vision, approach, thinking and way of working. I went through the whole book with full attention to see what I could adopt from what SRE engineers do within Google.


I bought the Kindle version of the book, and that is what I read this week. I thought it would be a good read for my DevOps team, so I bought a printed copy as well.

Overall, I liked this book very much. Many of the tools, systems and environments it describes exist only within Google, so they were not directly useful to me. However, the book helped me understand how those engineers work: how they create and track SLOs, how they handle outages, and what processes and methodologies they have in place. Moving forward, I am going to encourage my team to put more focus on writing a postmortem report after every outage, to organize and manage those reports, and to use them as a reference point for training as well as for future fixes.

I would encourage anyone working in DevOps, server administration, database administration, system/software architecture or similar roles to read this book. I would also recommend it to the DevOps/SRE engineers at Amazon, because their login page has been down for the last 15 minutes and I can’t look at the wish list I compiled there!


Next Week

Now that I feel more confident about being able to continue this challenge, I am trying to pick the book for next week. The following are at the top of my list, and I will pick one of them.

If any of you have read one or more of the above books, I would love to hear your feedback and will make my choice based on that 🙂

ABAW Challenge – The Second Machine Age: Work, Progress and Prosperity in a Time of Brilliant Technologies

Last week I started a challenge, which I call the ABAW Challenge (A Book A Week), to motivate myself to read a book every week. It was a serious challenge for me, because my hectic work schedule left me little time to focus on anything else. Interestingly, it was the difficulty itself that inspired me to go ahead and attempt this almost impossible mission.

The book I picked last week was The Second Machine Age written by Erik Brynjolfsson and Andrew McAfee.


The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies

“The Second Machine Age” is a New York Times, Wall Street Journal and Washington Post Bestseller!

I decided to buy the Audible version of the book so that I could listen to it. This let me use my time efficiently for this project in situations where normal reading was not possible, such as when driving. The audio recording of the book was 8 hours and 50 minutes long, and I was able to complete it within a week.

The Audible version of the book was narrated by Jeff Cummings, and I must say that I loved the narration.

Even though the whole exercise looked like a challenge when I started, the journey quickly became very enjoyable and exciting. I did not want to write about it until I was sure that I could keep it up for quite some time.

I am a great fan of the Industrial Revolution, which the authors of the book call the ‘First Machine Age’. Years ago, I read several books on the Industrial Revolution and its history, progress and explosion, and watched many movies, documentaries and videos about it, including the famous Charlie Chaplin movie Modern Times. Some of the remarks in this book about the first machine age reminded me of all of those, which multiplied the fun!

This book is a very interesting read, not only for technology professionals, but also for anyone with an interest in science, technology and computers.