Speed or Quality? What is more important? Lessons from Japan.

We Don’t Do DevOps.

In most of my encounters with new customers, I take the time to explain that I don’t “do DevOps”.

Yes, DevOps is a convenient name tag. It provides an easy-to-present packaging that has been feeding me and my colleagues well for the last five-plus years.

On the other hand – it definitely looks like the original meaning of the word (as coined by Patrick Debois) is continuously eroding. More and more folks in the industry use it to refer to modern practices of system administration – those that involve cloud, automation and, in the best case, also continuous delivery.

That’s why I almost never refer to what I’m doing as ‘DevOps’. Instead, I use the term “Software Delivery Optimization”. We’re optimizing the time it takes to deliver software and the quality of the resulting product. By applying systemic analysis. By measuring and transforming the four main entities: people, processes, tools and information flows. By introducing the latest technologies, constructive dialogue and the evergreen practices of continuous learning and improvement.

Optimization?

But optimization itself is a tricky term. To optimize means to make something better. And what is better? One could argue that in business terms the ultimate good is shown by the bottom line. More profit means better, less profit means worse. On the other hand – one could also say that delivery isn’t the only thing that influences the bottom line. Marketing, sales, financial management and finally macroeconomics – they all can save or sink your ship.

So if we cannot measure software delivery by the bottom line – what do we optimize it for?

Our main mantra of DevOps transformation is “Deliver Better Software Faster” and we used to say that it means we’re enabling speed of delivery without compromising quality.

But experience shows that this explanation reinforces one very problematic bias – namely, fixation on speed. It’s problematic because it negatively impacts quality, and this has been shown by research: DORA’s findings indicate that many organizations adopting DevOps practices improve speed at the cost of a decline in quality.

And it is quite natural. Speed and quality are, after all, eternal enemies. One cannot really focus on both. It would be like praying simultaneously to God and Satan and hoping they both help you.

Even when we say they’re both important – deep in our hearts we believe that one of them is stronger. We make this choice because this is how our minds work.

The Need for Speed

And more often than not – we choose speed. So why is it that we’re so crazy about speed?

This stems from two sources. First of all – the semantics. Delivery is the act of moving an object from point A to point B. Motion is measured by speed. So common sense tells us that optimizing delivery means optimizing its speed. Measurement defines behaviour after all.

And then – there’s the cultural background. The modern western (and increasingly – global) mindset is in general much more about speed than quality. This phenomenon has been labeled “throwaway culture” to describe a society focused on “excessive production of short-lived or disposable items over durable goods that can be repaired”. The trend can be traced back to the 1920s, when the manufacturing strategy known as “planned obsolescence” was defined. The idea is basically to build obsolescence into products, so the consumers need to buy new ones sooner. Manufacturers don’t like mentioning it, but we all know there’s a reason why we need to buy new smartphones, TV sets and even cars every 2-3 years. The marketplace is a race track, and it’s not quality but speed that defines who wins.

As an example – who remembers the LG G4? It was a beautiful phone, but it had a tiny problem – on a large number of devices the motherboard would just die unexpectedly. Sometimes after as little as a few months of use. LG admitted there was an issue, but you could only get a replacement if you purchased the phone directly from them. Replacing the motherboard was very costly, and all repair shops recommended against it. They said the new board would be as prone to getting bricked as the original one.

But that’s all water under the bridge – LG has since released G5 and G6 and hardly anyone remembers the G4 flop now.

Which once again goes to show that in modern reality – the speed with which you’re moving is more important than the quality of what you do.

So does this mean it’s ok to ignore quality and concentrate on getting to production faster? If you’re a startup looking to get acquired and your system is relatively technologically simple – the answer may be yes. All you need is to prove your business model – so thinking about quality and reliability isn’t that cost effective. Especially if you’re short on cash and consequently – on time.

But if you’re looking at building any type of sustainable business and your systems are relatively complex – you can’t really afford to ignore quality. Or else you will find yourself in never-ending firefighting and at constant risk of losing potential and existing customers. You have to maintain a basic level of quality – if not for your customers, then at least for your company’s sake.

Gosh, we’re back at square one! We want both speed and quality! How do we get there?

Screw Speed, Focus on Quality!

Consultants love beating you on the head with the stick of paradox, so they will tell you: “in order to get speed, you need to optimize for quality! And then speed will happen all by itself, without you even noticing. Try and see – it’s like magic!” I actually used to believe this to be true myself.

But I don’t anymore.

Because just like quality, speed per se is not a differentiator! They are both basic needs. They should be properties of your system, not its strengths.

As they say – those who are always in a hurry are always late. The only businesses that should really care about speed are those that are always lagging behind. And LG’s mobile phone business is a great example here. Up until now, all they were trying to do was catch up with the market leaders – Apple and Samsung.

Speed is important – but it’s not a competitive advantage.

Optimize for Adaptivity

I actually started this post during a family trip to Japan. Walking through heavenly beautiful temple gardens, listening to the stories of monks who invested years and even decades of their lives into creating them. These gardens aren’t forced on the surrounding landscape – instead they are fit into it. Built around the beauty of wild nature with careful thought and delicacy. They aren’t defined by the speed with which they were created but by their ability to adapt to the changing reality.

And that is our lesson here. The business value of DevOps lies neither in speed nor quality (even though it does help you improve both) – but in adaptivity! In your ability to experiment, innovate and adjust your processes and tools to the changing reality. It’s about the flow, not the rat race.

This focus on evolving, changing, shifting systems is omnipresent both in business and technological landscape of our days. The principle of enabling permanent flux is so important that we are ready to give up the confidence provided by universal rules and strict boundaries.

We’re ready to embrace the complexity and fragility of distributed, modular systems in order to allow each module to change independently when needed. We’re ready to loosen control and treat our systems and organizations as natural entities that mutate and grow on their own. While we take the role of enablers – removing the obstacles to this growth.

This has been the approach of learning organizations as defined by Peter Senge back in the 1990s. And we now see the same pattern in our technologies – getting ever more decentralized, event-driven and evolutionary.

So coming back to the original subject of the post: when optimizing your software delivery processes – optimize for change and adaptivity, not quality or speed. Optimize for the evolutionary approach, not for the long term confidence.

Can We Measure This?

But how do we measure this? We know how to measure velocity, we know how to measure quality. In fact that’s what our DevOps Flow Metrics are focused on. And this post is also a warning! Always remember – our actions are defined by what we measure. It’s important to measure speed and quality to make sure we’re getting better and not deteriorating. But don’t let these measurements direct your decisions!

If we want to optimize our ability to change – we need specific metrics to measure the ease of introducing, testing and delivering such change. And we’re not talking about planned, approved changes sitting in your backlog. We’re talking about emergencies, unexpected market conditions, natural disasters and sudden business opportunities. We’re talking about testing crazy ideas that can disrupt the market or make our products significantly more usable.

We always say that part of DevOps ROI can be measured by the reduction in the amount of unplanned work. But if everything goes according to the plan – you’re probably playing it too safe. So we do want a certain amount of unplanned work to be present. And we want to be able to absorb that unexpectedness smoothly – without stopping everything else, without staying at the office until 2AM, without burning out.

That’s why these outliers are what we want to optimize for and look out for. That’s why the most important question when designing our information flows should be “What if?”. What if I want to redeploy this system in the middle of the night on Saturday? What if I need to release a patch by 12am tomorrow? What if there’s a critical bug found in an older version? What if our testers need 2 more clusters to test the new feature? What if the requirement is incorrect and we need to introduce last minute changes?

And what we want to measure is the impact of these outliers on our overall ability to deliver. It’s not – reduce the amount of unplanned work, but rather – reduce the impact of unplanned work on our quality and velocity. We need to measure the correlation between the amount of impactful change and our overall quantitative and qualitative KPIs.

Out of that – other questions arise – like: how fast are we able to make educated decisions regarding what change is needed and how impactful it is? But that is a subject for a different post.

And how are you measuring your software delivery performance?