Automate everything

Date: 10 May 2016

"Everything can be automated, but the true path to automation success lies in discernment of the route and speed of the journey".

So we think Lean is good. We want to eliminate waste and get our Minimum Viable Product (MVP) out in front of customers. We want to do this with as little delay as possible so they can start getting benefit, giving us money and providing feedback on what they want next. We want to be able to deliver small, valuable chunks of features so we get a nice, tight feedback (and payment) cycle.

We also already know that cross-functional teams built around products are a ‘Good Thing’ – as is removing manual make-work in favour of repeatable automation. Finally, we also have sign off on our voyage of optimisation…

Before we dash headlong into an over-enthusiasm of automation, let’s have a quick recap on why we want to automate, then take a deeper look at what, how and when we want to automate, and then finally at some alternatives to doing everything ourselves.

So why not carry on doing things manually? We know how to do that and it’s tried and tested. Here are a handful of reasons why manual might not be best:

‘Not enough environments’ – environment creation backed up behind a manual build team
Manual regressions, broken environments and hard-to-track-down bugs from config
‘Works on my machine’
Painfully long and unproductive build times and test cycles
Defects from time-sharing test environments due to their repeated manual reconfiguration

Hold up. I think you get the picture. We’re supposed to be looking forwards!

But what about automation being more expensive than doing things by hand? Yes. But only if you have your scale dialled right down to a small time period or batch size. Building a couple of servers by hand is cheaper than automating the same build. But who only builds two servers, or three, or four or…? So what is the break-even point or ROI? Here’s one viewpoint. Unfortunately, there’s no simple answer as it does depend on your situation. One quick note for later is that the less you invest up front, the quicker the payback.

Premise – automation works out cheaper than doing it manually once you get beyond simple scale.
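To make the break-even idea a little more concrete, here is a minimal sketch in Python. It assumes a deliberately simple cost model: a one-off up-front cost to automate, plus a fixed cost per build for both the manual and the automated route. The function name and all the figures are hypothetical and only there for illustration; real costs depend on your situation.

```python
import math


def break_even_builds(upfront_cost, manual_cost_per_build, automated_cost_per_build):
    """Smallest number of builds at which automating costs no more than building by hand.

    Assumed (hypothetical) model:
        manual total    = n * manual_cost_per_build
        automated total = upfront_cost + n * automated_cost_per_build
    Returns None if automation never pays back under this model.
    """
    saving_per_build = manual_cost_per_build - automated_cost_per_build
    if saving_per_build <= 0:
        return None  # each automated build costs as much as (or more than) a manual one
    return math.ceil(upfront_cost / saving_per_build)


# Illustrative figures only (think "hours of effort"), not measurements.
print(break_even_builds(upfront_cost=40, manual_cost_per_build=4, automated_cost_per_build=0.5))
print(break_even_builds(upfront_cost=10, manual_cost_per_build=4, automated_cost_per_build=0.5))
```

With these made-up numbers, the second call has a smaller up-front investment and so reaches break-even after fewer builds, which is the "less you invest up front, the quicker the payback" point in miniature.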

Now we have the ‘manual bit’ out of the way, we’re good to go! Hold on a second. It’s very easy to get caught up in the excitement of new and shiny tools and techniques. Remember, we still have to get our Product’s MVP out of the door. So what do we want to be careful of?

Going off on an automation spree – leaving everyone else in environment hell
Bad ROI – going too deep or too far on the wrong things
Forming a parallel ‘DevOps’ team to go off and do the automation – SILOS!

Premise – we want to start getting benefit from our automation as soon as possible without jeopardising our product delivery.

One final anti-pattern – automation is not the goal, nor the end point. It is one axis of improvement for a delivery team to increase the team’s velocity and reduce lead time.

Premise – each team has TWO products – The Product and the Team (and its capabilities) – Justin Arbuckle, Chef

Building a team’s capability through automation is an investment just like product development.

What is this ‘Everything’ that you speak of?

Well, everything that is repeatable and can be delegated to a machine. Currently, thinking and investigatory activities cannot typically be automated. Neither can aspects of user research, design and usability.

There are two further caveats. First, automate in order of ‘what adds the most value’. Secondly, weigh up the cost vs value of each automation increment – aka ‘know when to stop’. Hmm, sounds like a product backlog. It is – and in fact you want to try and combine your ‘product’ and ‘automation’ lists into one, prioritised backlog. It’s key to remember that there is just as much tangible business value from automation (and process improvement) as there is in product delivery.
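As a rough illustration of a single, combined backlog, the sketch below ranks product and automation items together by estimated value per unit of cost and drops anything below a cut-off ("know when to stop"). The BacklogItem type, the prioritise function, the min_ratio threshold and the example items are all assumptions made for this sketch; value-over-cost is just one simple way to order a backlog, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    name: str
    kind: str     # "product" or "automation"; both live in the same list
    value: float  # estimated benefit, in whatever unit the team agrees on
    cost: float   # estimated effort, must be greater than zero


def prioritise(items, min_ratio=1.0):
    """Order items by value/cost, highest first, dropping those below min_ratio."""
    worthwhile = [item for item in items if item.value / item.cost >= min_ratio]
    return sorted(worthwhile, key=lambda item: item.value / item.cost, reverse=True)


# Hypothetical backlog mixing product features and automation work.
backlog = [
    BacklogItem("checkout feature", "product", value=8, cost=4),
    BacklogItem("automate environment build", "automation", value=9, cost=3),
    BacklogItem("gold-plate deploy dashboard", "automation", value=1, cost=5),
]

for item in prioritise(backlog):
    print(item.kind, item.name)
```

In this made-up example the automation item outranks the product feature because its estimated return is higher, and the low-value item is dropped from the list.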

A slightly more subtle point about cost benefit is that some automated capabilities stack; when a certain capability is reached, the benefits become greater than the incremental benefits of each step.

An example is the ability to stand up and configure a server or application and have it up and running in seconds or minutes. There are many steps to get to this level of automation, each with pay-offs, but primarily, the win is the reduced cost of waiting.

What else then? By dropping instance start-up time, we also dramatically reduce the Mean Time To Recover (MTTR). We can kill off an instance while spinning up a new one. If you are working in the cloud, you know this is important as instances can and do fail, often without warning. We also enable autoscaling; we can switch off instances when load is low and bring them up on demand. And there’s more! In development, we may have a number of environments with the application running. Unless we’re a ‘chasing the sun’, geo-distributed development team, we only use these for around eight hours a day, so we can turn them off at the end of the day and switch them on again in the morning, thereby saving energy and money or re-purposing compute for other out-of-hours workloads.

I worked on one project where – over time – we automated all environments for a lift and shift of a monolithic application to AWS. We patterned our environments and pipelines, then duplicated them horizontally across multiple brands, teams and feature branches giving around 300 development instances. We were able to automatically switch them off every night at 18:30 and start them at 05:30 every morning, just on weekdays, roughly saving the business $45,000 per month just in switched-off infrastructure costs.
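The schedule described above can be sketched as a simple rule. This is not the code used on that project, just a minimal Python illustration. It assumes instances stay off from Friday 18:30 until Monday 05:30, and the names should_be_running and weekly_on_fraction are hypothetical.

```python
from datetime import datetime, time, timedelta

START = time(5, 30)   # switch on at 05:30
STOP = time(18, 30)   # switch off at 18:30


def should_be_running(now: datetime) -> bool:
    """True if a development instance should be up at the given local time."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and START <= now.time() < STOP


def weekly_on_fraction(step_minutes: int = 30) -> float:
    """Fraction of a week the instances are up, sampled every step_minutes."""
    monday = datetime(2016, 5, 9)  # any Monday at midnight works as a start
    steps = 7 * 24 * 60 // step_minutes
    running = sum(
        should_be_running(monday + timedelta(minutes=step_minutes * i))
        for i in range(steps)
    )
    return running / steps


print(should_be_running(datetime(2016, 5, 10, 12, 0)))  # a Tuesday lunchtime
print(round(weekly_on_fraction(), 3))
```

Under these assumptions each instance runs for 13 hours on each of five days, so 65 of the 168 hours in a week, and is switched off for the rest.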

Technology trees

Over a couple of ‘devopsifications’, I’ve seen something of a pattern emerging. In fact, it has a very similar feel to the technology tree from the Civilization games! Here are a couple of examples:

Exactly what your tree looks like in your environment will depend on the technology stacks you’re working with and the capabilities of your existing infrastructure. You will want to try and map out what your tree looks like or bring in help from people who have already done this.

Standing on the shoulders of giants

There is one more caveat to ‘automating everything’. Sometimes it’s better to buy or borrow than build yourself. Some examples are obvious – there’s a high chance you will need some kind of artifact repository. You could roll your own from scratch for the fun of it (but that would be crazy), or you could use a product like Artifactory or a Docker Registry. A deeper infrastructure example might be using community Chef recipes for configuring servers.

The final corollary to the buy vs build in the Land of DevOps is buying or, more accurately, renting services and capabilities. With AWS, Google and friends, there’s now an almost bewildering array of instances, servers, services and higher level capabilities. If you’re in a position to take advantage of these offerings, then you can save a lot of time and expense here. Why would you roll all the automation for a Load Balancer when you can spin one up (and hook it in) at the flick of an API call?

…or a pain in the ‘aaS

Standing on a giant’s shoulders can be great, but if you’re tied on, you have to rely on the giant going the way you want to. At worst, you could end up like Bran, strapped to Hodor when he flips out, running in the dark and bashing Bran’s head against low-hanging obstacles…

Lock-in is a potential risk – and the more bespoke the services, the higher the risk. That said, many IaaS providers offer similar features, pushing the risk towards the tooling you put in place to manage your infrastructure. While AWS CloudFormation is tightly coupled to AWS, HashiCorp’s Terraform is somewhat more provider agnostic.

Mind your own business

Remember, we want to do the next thing that brings the most value and improvement. Are you an infrastructure company? The answer is probably not – in which case, why would you invest so heavily in your own Infrastructure and Automation, plus the ongoing cost of maintaining it? Well, perhaps you don’t have to.

Currently, the ultimate in buy vs build is Platform as a Service. Offerings like CloudFoundry (Pivotal, IBM Bluemix, HP Helion… ), RedHat’s OpenShift, Heroku and AWS Elastic Beanstalk can take away the burden of infrastructure management and also the need for you to heavily invest in automating it. A close relative is Containers as a Service (mainly Docker) with an even more extreme version being running just code ‘functions’ as with AWS Lambda or IronWorker. These platforms offer varying levels of enterprise grade services (such as databases, messaging or discovery) and are all opinionated around the architectures they support and how you deploy to and manage them. What you lose in configurability, you gain in ease of management, use and maintenance, allowing true self-service infrastructure. You do still need to invest in your code build/deploy/test to realise the benefits of Continuous Delivery.

Hybrid or chimera

That all sounds lovely for small, new teams, new codebases where we can design for platforms and new organisations with green fields rolling on all sides. What about the other 99.99% of us? If you already have infrastructure, leverage it by deploying a PaaS like CloudFoundry onto it. You can make your infrastructure more homogeneous (cheaper to build and manage), while bootstrapping your teams with self-service capability. If you already have existing applications (read monolithic apps), then you probably have to stick with automating into your own data centre or maybe IaaS. Here, having virtualisation in place and infrastructure that follows the jameswhite manifesto will make your journey considerably easier. Then draw up a roadmap heading towards PaaS by splitting off upgraded or new behaviour (as microservices) onto a platform architecture.

Coming full circle

Shortened feature lifecycles, reduced time to market, minimising waste – these are all goals of modern, DevOps-enabled companies. Automation of the facilities and infrastructure which enable product teams is now seen as a key way to attain these goals. However, the gap between wanting to improve and knowing where to start is filled with many different options and avenues. There is no one-size-fits-all solution. With all the technical solution options, the key part of a DevOps transformation is actually the culture and thinking which shape and drive out the technology changes. Here are some ways of thinking about and approaching the problem:

Take advantage of what’s already available
If you can take advantage of a big step change quickly and with relatively small cost, do so
If you’re going to do something yourself, take small, incremental steps
Always do the most valuable thing next
Everything can be automated, but know when to stop

So, with these in mind, if you can make use of someone else’s infrastructure, maybe you should. If you can use pre-made services like databases or load balancers, do so. If you can cut out the lower-value commodity infrastructure and leap straight to a code running platform, try it. If you can shortcut your own journey by benefiting from someone else’s experience, why wouldn’t you?