Third & Grove
Mar 17, 2020 - Justin Emond

When to do Production Releases


Pretty much everything I know about software development I learned from reading Joel Spolsky's posts on Joel on Software. For the last decade or so it has mostly been updates on his career, but prior to that there was a golden age of several years where he put out incredibly insightful pieces about everything in software development. If you are an engineer or an engineering leader, it is required reading.

I cannot for the life of me find the post, but I remember learning my first lesson about how to do production releases from Joel on Software many, many years ago. Fog Creek (his company at the time) experienced a failed release on a Sunday that was painfully hard to recover from because most of the team that would typically have been able to jump in and help was out of reach. It was Sunday, after all. After that release they decided to schedule all of their production releases at 10 AM on a workday, when everyone was in the office, caffeinated, rested, and online.

We view it as our job to be zealous advocates for our clients’ technology. That is why, counterintuitive as it may sound, we generally recommend that our clients conduct releases during business hours: done under the right conditions, it minimizes disruption to the business.

But for business hours to be the best release window, several prerequisites must be met:

  • Your release does not require any downtime to be rolled out
  • Your release is reversible 
  • You use continuous integration to power your releases
  • The scope of the release is generally narrow
  • You have an edge cache in place
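One way to make this checklist concrete is to encode it as a small pre-release check. The sketch below is only an illustration: the `ReleaseProfile` class, its field names, and the helper functions are hypothetical and not part of any particular tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReleaseProfile:
    """Hypothetical description of a planned release (field names are illustrative)."""
    requires_downtime: bool
    reversible: bool
    uses_ci: bool
    narrow_scope: bool
    has_edge_cache: bool


def unmet_prerequisites(profile: ReleaseProfile) -> List[str]:
    """Return the prerequisites from the checklist that this release fails."""
    checks = [
        (not profile.requires_downtime, "release requires downtime"),
        (profile.reversible, "release is not reversible"),
        (profile.uses_ci, "release is not powered by continuous integration"),
        (profile.narrow_scope, "release scope is not narrow"),
        (profile.has_edge_cache, "no edge cache in place"),
    ]
    return [reason for ok, reason in checks if not ok]


def business_hours_ok(profile: ReleaseProfile) -> bool:
    """A business hours release is only recommended when every prerequisite holds."""
    return not unmet_prerequisites(profile)
```

A team could run a check like this when scheduling a release and treat any returned reason as a prompt to rethink either the release or the window.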

Why is it crucial for releases to be reversible? Because in engineering it is often a good idea to design systems that fail well. The difference between a great engineer and a researcher is that great engineers accept that failure is a part of the system. What about “failing forward”? Well, as long as you can fail forward fast then you have achieved the same outcome as a release being reversible: resolving the issue rapidly.

If your release isn’t reversible, you have broader issues to contend with, and you likely need a new strategy: perform the release in smaller, more discrete steps, one at a time, over time.

An edge cache is a must-have for a business hours release because it offers an additional layer of protection if there is a release issue (as long as you flush the cache last).
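The ordering matters here, so a minimal sketch may help. It assumes four hypothetical callables (`deploy`, `health_check`, `rollback`, `flush_edge_cache`) that stand in for whatever your own tooling provides. The point is only the sequence: the cache is flushed last, and only once the new release looks healthy, so visitors keep getting cached pages if the release has to be reverted.

```python
from typing import Callable


def run_release(
    deploy: Callable[[], None],
    health_check: Callable[[], bool],
    rollback: Callable[[], None],
    flush_edge_cache: Callable[[], None],
) -> bool:
    """Deploy, verify, and flush the edge cache last.

    All four callables are hypothetical placeholders for your own tooling.
    Returns True if the release went live, False if it was rolled back.
    """
    deploy()

    if not health_check():
        # The edge cache has not been flushed, so it can keep serving the
        # previously cached pages while the origin is reverted.
        rollback()
        return False

    # Only expose the new release through the cache once it is verified.
    flush_edge_cache()
    return True
```

This is also where reversibility pays off: the `rollback` step only works as a safety net if the release was designed to be reverted in the first place.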

There are, of course, exceptions to business hours releases. Most critically, if your release must include downtime, you have to release during a quiet traffic window. After-hours upgrades need a thoughtful plan, one that accounts for your staff and ensures a clear path for escalating to more senior engineers.