It's impossible to improve what you can't measure. – Peter Drucker
This quote reflects the importance of metrics in every domain right now. Intuitively, this statement resonates with all of us. How do you know if you're actually growing if you don't have the data to understand where you've been and where you're going? This blog post talks about the importance we see of metrics, particularly in continuous delivery.
How do metrics matter in the context of continuous delivery?
What lets you know if you've actually achieved a good state of continuous delivery? To understand the answers to these questions, the GoCD team interviewed consultants, developers, operations teams, and people at every intersection of DevOps. We also talked to business stakeholders, because part of successful continuous delivery is being able to collaborate successfully between business and technology, and being able to communicate the value of some of the underlying work that goes into the implementation of continuous delivery.
There are four metrics that we synthesized out of this that we think are really valuable:
- Number of deploy-ready builds
- Cycle time
- Mean time between failures
- Mean time to recover
How many deploy-ready builds do you have?
For successful continuous delivery, you need routine commits, and specifically routine commits to master. If I'm committing all the time to my own personal branch, I'm not adding value to the code that's actually ready for production.
A good rate of deploy-ready builds also relies on having testing you can trust. One common anti-pattern we heard was the existence of testing that has to run for several hours, or even an entire day to do validation. In many cases these tests were unreliable, meaning that at the end of that period, you're no more certain than you were at the start. That becomes costly because it makes everyone very wary of ever releasing. Deploying software can feel like Russian Roulette.
This metric also emphasizes the importance of collaboration between product and engineering roles. Cross-functional teams must be able to create a roadmap such that at any point stories are broken up small enough that you can release them and deliver real value to users. If the product side isn't engaged in this, teams develop backlogs of large chunks of work that don't add any value until late in the game.
As a complement to careful roadmap planning, the development team should employ patterns like feature toggles, which allow features to be deployed without exposing them to customers.
What is your cycle time?
Long cycle time was the most common pain point we heard from developers. The time from when a commit is made, through testing and validation, to a deployment can be an enormous source of frustration for a developer. As an engineer, waiting for feedback requires disruptive context switching and represents wasted time. A light-hearted, but very real representation of this is a classic XKCD comic about sword fighting while waiting for code to compile.
This dead time doesn’t get you any closer to delivering real value to users, and can create a loss of focus. Improving a team’s cycle time relies on efficient testing, and on getting feedback as quickly as you can. Here are a few practices that can help improve your cycle time.
- Running your unit tests early in your pipeline and your complex, longer running automated tests downstream, will provide essential feedback sooner, and save you time.
- Passing dependencies from pipeline stage to pipeline stage can help avoid unnecessary rebuilding of artifacts, which can be really valuable.
- Parallelizing your builds when possible also provides significant savings.
- Lastly, make sure you've got the right build resources so that whatever builds you need to run, you have enough agents to do the job.
What’s your mean time between failures and mean time to recover?
Mean time between failures and mean time to recover often go hand in hand because it is important to balance the two of them. Mean time between failures reminds the team to keep the build green whenever possible, and to avoid easy failures. However, only looking at mean time between failures and trying to avoid failure completely can result in teams becoming overly cautious and never releasing anything new. The core point of software development is to provide new value to users and make sure that we're serving their needs. Thus, a focus on mean time to recover – a metric that represents the ability to bounce back from a misstep – is a key counterbalance.
Achieving a good mean time between failures relies on getting feedback early on and making sure that thorough validation occurs in testing environments. These validations should be run on production-like environments with realistic data. Strong local builds are also crucial here.
Since failure is inevitable, it’s important to make sure that your mean time to recover is as quick as possible. How long does it take to get you back to a green build after you've had a pipeline failure, or after you've had a release that's failed? Robust monitoring of production is essential. Teams should learn about failures through your monitoring and alerts, not through customer complaints.
Drilling key practices like rolling back can also improve mean time to recover. Having an automated rollback process can buy you a little bit of time to understand where the issue occurs. Diagnosing the cause of an issue quickly also relies on informative logging that enables developers to pinpoint a problem when they’ve been paged at 2:00am.
Summary
Going back to Peter Drucker's quote, to improve anything, first you need to find a way to measure it and make it visible. This is why, having a dashboard and making metrics visible to the team gives them a sense of ownership and a sense of connection that is really valuable. On the other hand, I don't want to say that metrics are a panacea. There are definitely some meaningless metrics and vanity metrics out there. Ultimately, you want to incentivize people to look at hard problems, and where they can create meaning for the team or organization.