Although the process of releasing software varies considerably from application to application, we would probably all agree there are common reasons releases tend to fail. Lack of planning and preparation, defective code, infrastructure and release tooling failures, and poor communication all transcend technology choices and industries.
Indeed, no one can guarantee a release’s success each and every time. However, there are steps we can take to improve the odds of executing consistently successful software releases. Based on my experience with a diverse set of enterprise software organizations, I’ve collected the following suggestions for improving release success.
Manage the Release, Don’t Let the Release Manage You
Engage an experienced Release Manager, someone who schedules the release, coordinates the resources, runs the release call, leads the release team through the plan, and communicates status to stakeholders. The Release Manager serves as a single point of contact (aka SPOC) for the release. Having a Release Manager allows other release team members to focus on their specific tasks without needless distraction.
A Goal Without a Plan is Just a Wish
Always have an implementation plan (aka IPlan). Whether a single page of paper, or a hundred-page Microsoft Word document, script the entire release in advance. What are the required tasks? What is their order of execution? What resources are required to complete each task? What is the test plan? What is the contingency plan, if or when something goes awry?
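To make those questions concrete, here is a minimal sketch of an implementation plan captured as data rather than prose. Everything in it is hypothetical: the `Task` fields, the `execution_order` and `missing_rollbacks` helpers, and the sample task names are my own illustration, not a standard format. The point is only that once tasks, owners, dependencies, and contingency steps are written down explicitly, the order of execution and the gaps in the contingency plan can be checked mechanically.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    """One scripted step of a hypothetical implementation plan."""
    name: str
    owner: str                                    # resource responsible for the task
    depends_on: list[str] = field(default_factory=list)
    rollback: str = ""                            # contingency action if the task fails


def execution_order(tasks: list[Task]) -> list[str]:
    """Return task names in an order that respects every depends_on entry."""
    by_name = {t.name: t for t in tasks}
    ordered: list[str] = []
    visiting: set[str] = set()

    def visit(name: str) -> None:
        if name in ordered:
            return
        if name not in by_name:
            raise ValueError(f"unknown task {name!r}")
        if name in visiting:
            raise ValueError(f"circular dependency at {name!r}")
        visiting.add(name)
        for dep in by_name[name].depends_on:
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for task in tasks:
        visit(task.name)
    return ordered


def missing_rollbacks(tasks: list[Task]) -> list[str]:
    """List the tasks that have no contingency step written down."""
    return [t.name for t in tasks if not t.rollback]
```

A plan built this way can be reviewed in two lines: call `execution_order(plan)` to see the sequence the team will follow, and `missing_rollbacks(plan)` to see which tasks still need a contingency.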
Don’t Boil the Ocean
Releases should have specific goals; the fewer, the better. Release goals should be clearly stated at the beginning of the implementation plan. If you feel you need to compress several goals into a single release, you are probably not releasing frequently enough.
For simplicity, separate software-centric releases from infrastructure-centric changes. Don’t tie the success of either to the success or failure of the other.
Expect Success, Plan for Failure
Channel your inner Boy Scout and always be prepared. Have a contingency plan. Plan for multiple failure scenarios. Individual task failures only lead to release failures when we don’t have a plan to correct for the individual failures quickly. Are additional resources required in the event of a failure? If so, they should be available, and aware of the release plan and the contingency plan, in advance.
Seek the Understanding and Approval of Others
This is one of those times when seeking the understanding and the approval of others is a good thing. Review the plan in advance with all required resources and stakeholders. Ensure there is a complete comprehension of the plan, goals, resource requirements, and timing. Discuss the plan’s level of risk to the organization and its customers. Seek the support and approval of stakeholders when required.
The More I Practice the Luckier I Get
Do a dry run of the plan, even if it is just a verbal walkthrough with the release team. Better yet, do a live run in a production-like staging environment. Adjust the plan if necessary. Also, don’t forget to practice your contingency plan.
Should I Pack a Lunch?
Part of the plan’s review should be a discussion of timing. Release resources should know the estimated total time for the release (aka release window), both best case and not-so-best case. Also, if individual stages are expected to take more than a few minutes to complete, those times should be understood in advance. There is nothing worse than staring endlessly at the third stage of a ten-stage continuous integration pipeline for 15 minutes, with no idea of how much longer the task is expected to take.
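As a rough illustration of the arithmetic, the release window is just the sum of the per-stage estimates, computed once for the best case and once for the not-so-best case. The stage names and minute values below are invented for the example, as are the `release_window` and `long_stages` helpers; substitute your own pipeline's numbers.

```python
from datetime import timedelta

# Hypothetical stages: (name, best-case minutes, worst-case minutes)
STAGES = [
    ("build", 5, 10),
    ("deploy to staging", 10, 20),
    ("integration tests", 15, 40),
    ("deploy to production", 10, 25),
]


def release_window(stages):
    """Return (best case, worst case) total duration for the release."""
    best = sum(best_minutes for _, best_minutes, _ in stages)
    worst = sum(worst_minutes for _, _, worst_minutes in stages)
    return timedelta(minutes=best), timedelta(minutes=worst)


def long_stages(stages, threshold_minutes=5):
    """Name the stages whose worst case exceeds the threshold,
    so the team knows in advance where the waiting will happen."""
    return [name for name, _, worst in stages if worst > threshold_minutes]
```

Sharing both numbers from `release_window(STAGES)` with the team, along with the output of `long_stages(STAGES)`, answers the "should I pack a lunch?" question before the release starts.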
Don’t Start the Game without the Whole Team on the Field
Never start a release without all the required resources present and prepared. A quick release can turn into a long and painful experience if you are waiting on resources. In my experience, the longer a release takes to complete, the greater the risk of failure.
Make a Plan and Stick to It
You wrote the plan, you reviewed the plan, you practiced the plan, and you received the approval of your stakeholders for the plan. So, follow the plan.
If you absolutely must deviate from the plan, take the time to consider the impact and potential risks. Document the deviation for the post-mortem and for planning the next release.
Test Early, Test Often
Testing should not be the final step to confirm the release’s success. Testing should be done continuously and automatically, throughout the release process. Test after each significant stage of the plan. It’s easier to find and correct issues the earlier they are discovered.
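One way to picture "test after each significant stage" is a runner that pairs every stage with its own check and stops at the first check that fails. This is only a sketch under my own assumptions: the `run_release` function and the shape of a stage (a name, an action, and a check) are hypothetical, and a real pipeline tool would express the same idea in its own configuration.

```python
from typing import Callable, List, Optional, Tuple

# A stage is (name, action to perform, check that verifies the action worked).
Stage = Tuple[str, Callable[[], None], Callable[[], bool]]


def run_release(stages: List[Stage]) -> Optional[str]:
    """Run each stage followed immediately by its check.

    Returns the name of the first stage whose check fails, or None if
    every stage passed. Later stages are never started after a failure,
    so the problem is caught close to where it was introduced.
    """
    for name, action, check in stages:
        action()
        if not check():
            return name
    return None
```

Because the runner halts at the failing stage, the contingency plan only has to undo the work completed so far, not the entire release.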
Mute is Your Friend, Always be on Mute
Emotions will flare, words will uncontrollably leap from your lips on occasion, background noise makes the release call difficult to follow, and diagnosing release issues isn’t best done in front of an audience. Keep the release call focused on the plan and free of emotion.
Are We There Yet?
Consistently communicate before, during, and after the release. Let stakeholders know in advance when the release is starting and how long it is expected to take. Keep stakeholders aware of significant deviations from the plan, especially those with customer-facing impact. Let everyone know when the release finishes, and share a final release status. Keep communications germane.
Ensure everyone understands how the release status will be communicated, be it email, IM, persistent chat, or a web-based status page.
Mistakes are Meant for Learning, not Repeating
Successful or otherwise, follow each release with a blameless post-mortem. A post-mortem might only require a quick five-minute chat. Or, the whole release team might need an hour with a therapist (that’s a joke). Discuss what went right, and what did not go as well as expected. Be keen to focus on repeat problems and problems that were not caught during the release. Most importantly, consider how to continuously improve the release process.
Releasing is a Game, Keep Score
Know your release’s Key Performance Indicators (KPIs) and your Service Level Agreements (SLAs). Understand what matters most to your stakeholders and to your customers. Always know how you are performing against your KPIs and SLAs, and what you need to do to improve them. Dashboards are great tools to display KPI and SLA performance transparently.
Is your failure rate increasing or decreasing? Is one type of task responsible for a majority of your release failures? Is your release taking longer or shorter to complete each time? Did you experience an unexpected outage during the release? For how long? What is the volume of post-release issues caused by, but not discovered during, the release?
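Several of these questions reduce to simple arithmetic over a release history. The sketch below assumes a hypothetical `ReleaseRecord` with just three fields; the field names and helper functions are mine, chosen for illustration, and a real scorecard would pull this data from whatever your release tooling already records.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReleaseRecord:
    """Hypothetical record of one release, oldest first in a history list."""
    duration_minutes: int
    failed_task_type: Optional[str] = None   # None means the release succeeded
    escaped_issues: int = 0                  # issues found only after the release


def failure_rate(records: List[ReleaseRecord]) -> float:
    """Fraction of releases that failed (0.0 for an empty history)."""
    if not records:
        return 0.0
    failures = sum(1 for r in records if r.failed_task_type is not None)
    return failures / len(records)


def top_failure_cause(records: List[ReleaseRecord]) -> Optional[str]:
    """Task type behind the most failures, or None if nothing has failed."""
    counts = Counter(r.failed_task_type for r in records if r.failed_task_type)
    return counts.most_common(1)[0][0] if counts else None


def duration_change(records: List[ReleaseRecord]) -> int:
    """Minutes by which the latest release was longer (+) or shorter (-)
    than the earliest one in the history."""
    if len(records) < 2:
        return 0
    return records[-1].duration_minutes - records[0].duration_minutes


def escaped_issue_count(records: List[ReleaseRecord]) -> int:
    """Total issues caused by, but not discovered during, the releases."""
    return sum(r.escaped_issues for r in records)
```

To see whether the failure rate is rising or falling, compare `failure_rate` over an older slice of the history with the same function over a recent slice; the other helpers map directly onto the remaining questions.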
All opinions in this post are my own and not necessarily the views of my current employer or their clients.