In Part 1 of our two-part “You Can’t Measure What You Can’t See: Driving Outcomes through Value Stream Maps” series, we covered how to create a value stream map and took a high-level look at the four metrics to track at each stage of your value stream. In this installment, we’re going to dig into how you can turn this data into insights to increase your DevOps maturity.
Innovation speed is the number one reason companies adopt DevOps. However, the Software Delivery Performance Matrix shows that speed isn’t the only benefit. In fact, the highest performing DevOps teams balance speed with quality, increasing not only how fast they work, but also how well they work.
The question becomes, “How do you optimize speed and quality in parallel?” Luckily, value stream maps give you the data you need to address both, and the strategies detailed below will help you get started on improving both in your development lifecycle. Remember to focus on one area of improvement at a time: if you focus on everything, you focus on nothing. While elite organizations have high performance across the board, they got there by making stepwise improvements over time. Treat your journey the same way; a marginal improvement in one area can ultimately have an outsized impact on overall business outcomes.
Side note: If you haven’t mapped out your value stream, we recommend starting your optimization journey here—understanding your value stream is the essential first step to realizing speed and quality performance gains. Everything is theory until you have the data—without it, you can’t accurately identify inefficiencies, reduce waste, or drive improvement.
Lead time, the time it takes to release a feature to production after development is complete, is the first indicator of speed. Remember, your team isn’t only delivering work; you’re delivering capabilities that increase business value, so the longer you delay releasing your work, the longer it takes the business to realize value.
Understanding your current lead time is step 1. Step 2 is benchmarking it against industry peers, your team’s previous performance, or even other development teams in your organization. These comparative metrics give you a baseline for where your team stands and also help you identify what lead time you should be able to achieve. The top 13% of Salesforce organizations have a lead time of less than one hour, while high performing teams have a lead time of less than one day. How do they do it?
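To make the measurement concrete, here is a minimal Python sketch of how lead time could be calculated and compared against the two thresholds mentioned above. It assumes you can export a "development complete" timestamp and a "released to production" timestamp for each change; the function names, sample data, and tier labels are hypothetical and only for illustration.

```python
from datetime import datetime, timedelta
from statistics import median


def lead_times(changes):
    """Return the lead time for each (dev_complete, released) pair."""
    return [released - dev_complete for dev_complete, released in changes]


def median_lead_time(changes):
    """Median lead time, which is less sensitive to one unusually slow release."""
    seconds = [lt.total_seconds() for lt in lead_times(changes)]
    return timedelta(seconds=median(seconds))


def benchmark_tier(lead_time):
    """Compare a lead time with the one-hour and one-day thresholds (labels are illustrative)."""
    if lead_time < timedelta(hours=1):
        return "under one hour"
    if lead_time < timedelta(days=1):
        return "under one day"
    return "one day or more"


# Hypothetical sample data: (development complete, released to production)
changes = [
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 15, 0)),    # 6 hours
    (datetime(2024, 3, 5, 10, 0), datetime(2024, 3, 6, 10, 0)),   # 24 hours
    (datetime(2024, 3, 7, 13, 0), datetime(2024, 3, 7, 13, 30)),  # 30 minutes
]

typical = median_lead_time(changes)
print(typical, benchmark_tier(typical))
```

The median is used here rather than the mean so that a single delayed release does not dominate the picture; either could be a reasonable choice depending on what you want to track.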
While there are many factors, two strategies for reducing lead time may include:
Deployment frequency is the second indicator of speed, as it measures how often a team releases work to production. Higher performing organizations are far more likely to integrate developers’ changes on an ongoing basis, and when those changes are integrated more frequently, they are likely to be less complex and less costly.
However, if teams simply moved faster and merged code into production more often without any safeguards, quality could plummet. Instead, increasing deployment frequency is best accomplished with guardrails provided by DevOps methodologies and tools.
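Deployment frequency can be tracked with very little tooling. The sketch below assumes you have a list of production deployment dates, and it counts deployments per ISO calendar week; the function name and sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def deployments_per_week(deploy_dates):
    """Count production deployments per ISO (year, week) pair."""
    counts = Counter()
    for day in deploy_dates:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


# Hypothetical sample data: one entry per production deployment
deploy_dates = [
    date(2024, 3, 4),
    date(2024, 3, 6),
    date(2024, 3, 8),
    date(2024, 3, 12),
]

weekly = deployments_per_week(deploy_dates)
# Average over weeks that had at least one deployment;
# weeks with no deployments do not appear in the counts.
average = sum(weekly.values()) / len(weekly)
print(weekly, average)
```

Watching the weekly counts over time, rather than a single average, makes it easier to see whether a process change actually moved the number.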
Strategies that can contribute to increased deployment frequency include:
You might notice that teams who deploy more frequently tend to see lower risk and faster time to value. That’s because while each of these strategies can contribute to increased deployment frequency, they may also help to ensure increased quality at the same time. Building higher levels of security and compliance into your DevOps processes ultimately enables a faster, more reliable, and more consistent value flow from IT to end user.
Ideally, errors would never make it to production to disrupt a business process or workflow. But this is real life, and every part of the development lifecycle requires human input at one point or another (either doing the work or setting up the automated systems and tests), so errors do happen. The first quality indicator, change failure rate, measures the percentage of production releases that cause deployment errors resulting in business disruptions. This measurement helps show delivery leaders how their work affects the business.
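As a quick illustration of the arithmetic, change failure rate is the number of failed production releases divided by the total number of production releases, expressed as a percentage. The sketch below assumes each release has already been labeled as having caused a disruption or not; how you label a release is up to your team, and the names and sample data are hypothetical.

```python
def change_failure_rate(releases):
    """Percentage of releases flagged as having caused a business disruption.

    `releases` is a list of booleans: True means the release caused a disruption.
    """
    if not releases:
        return 0.0  # no releases yet, so nothing has failed
    return 100.0 * sum(releases) / len(releases)


# Hypothetical sample data: 20 production releases, 3 of which caused a disruption
releases = [True] * 3 + [False] * 17

rate = change_failure_rate(releases)
print(f"{rate:.1f}%")
```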
Decreasing change failure rate and maintaining a low number is critical for the business—not just in terms of value creation or loss in the moment, but for overall end user trust and business continuity. Customer exposure to bugs and any amount of downtime due to breaking changes can be incredibly costly.
That said, lowering this number can be complex because it’s tied to factors across people, process, and product. Let’s take a look at a few methods for addressing each:
Change failure rate needs to be as close to zero as possible. Decreasing it mitigates risk and is essential for maintaining business value over time.
When errors are deployed, how long does it take your team to troubleshoot, fix, or roll back those changes? The answer to that question is your mean time to recovery. You can think of change failure rate and mean time to recovery as a pair. Failures should happen as infrequently as possible, but when they do happen, your team should be able to fix them quickly.
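Mean time to recovery is an average of how long each incident lasted. The sketch below assumes you record when each production failure started and when service was restored; the function name and sample incidents are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average duration between failure start and restoration.

    `incidents` is a list of (started, restored) datetime pairs.
    Returns None when there are no incidents to average.
    """
    if not incidents:
        return None
    total = sum((restored - started for started, restored in incidents), timedelta())
    return total / len(incidents)


# Hypothetical sample data: (failure started, service restored)
incidents = [
    (datetime(2024, 3, 4, 10, 0), datetime(2024, 3, 4, 10, 30)),  # 30 minutes
    (datetime(2024, 3, 9, 14, 0), datetime(2024, 3, 9, 15, 30)),  # 90 minutes
]

mttr = mean_time_to_recovery(incidents)
print(mttr)
```

Tracked alongside change failure rate, this gives you both halves of the pair: how often releases fail, and how long each failure lasts.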
A short mean time to recovery is essential for many of the reasons discussed above—not only does downtime potentially erode customer trust, but it also contributes to low productivity and even lost revenue when systems are unreliable. Clearly, your organization wants to avoid all of these outcomes.
Here are a few practices to think about in order to decrease mean time to recovery:
Of course, speed and quality are intertwined. If your team has reduced lead time and is deploying frequently, you are also more likely to reduce your mean time to recovery, because the code that developers need to assess is likely to be less complex and can therefore be fixed faster.
Teams who regularly monitor development metrics perform better, but more effective performance isn’t based on monitoring metrics alone. The crucial step is turning insight into action and implementing strategies that help your team improve on areas of weakness and scale areas of strength.
There are many ways to approach a problem and optimize a team’s performance, but you are more likely to realize performance gains in all areas when you implement a comprehensive CI/CD tool and follow common DevOps practices.
Trading quality for speed is no longer necessary. With a strong DevOps culture and value stream mapping to bring visibility to your processes and their effectiveness, it is clearly possible to “optimize for stability without sacrificing speed.”