In our latest #TechTalks episode, Zoe Cunningham is joined by Christie Wilson, software engineer at Google, to explore the importance of continuous delivery, how it started as continuous integration during the 90s, and how to use it now to improve your work. (& enter our competition below!)
Christie Wilson is a software engineer at Google, with over a decade of experience dealing with complex deployment environments and high-criticality systems. She is a frequent speaker on CI/CD at conferences including KubeCon, OSCON, QCon, and PyCon. At Google, she built internal productivity tooling for AppEngine, bootstrapped Knative, and created Tekton, a cloud-native CI/CD platform built on Kubernetes. She is the author of the new book Grokking Continuous Delivery.
Competition giveaway: Continuous Delivery
One lucky reader can get a copy of Christie Wilson’s book, Grokking Continuous Delivery, for free!
- Share this podcast’s link on your LinkedIn, Facebook or Twitter platform
- In your message, tag Softwire
- and add the hashtag #continuousdevelopment
The winner will be drawn at random from all entries on Friday 24th December 2021 at noon (GMT) and announced on Softwire social media channels.
Zoe: Hello and welcome to Softwire #TechTalks. Today, I am delighted to welcome Christie Wilson, who is a software engineer at Google, an expert on continuous delivery, and the author of a new book, Grokking Continuous Delivery. Hi Christie, can I ask you to tell me a bit about yourself and maybe also an interesting fact about yourself?
Christie: Hi, Zoe, for sure. I’m a software engineer. I currently live in Vancouver, Canada. My interesting fact is I’ve been trying to learn to do handstands for about seven or eight years and I still can’t do it. But I’m not giving up. I’m confident that one day I’m going to get it.
Zoe: That’s a fantastic fact. Can you measure your improvement? Can you see that you’re better at handstands now than you were when you started?
Christie: Sort of actually. What I can say is that I was better maybe a year and a half ago and I’ve gotten worse since then. But I’m still optimistic about the whole process.
Zoe: Could you do them when you were younger?
Christie: No, I never could, actually. I think that’s something I would encourage kids to learn to do, because with skills like that, if you get them when you’re young, they seem like they never go away. Learning as an adult is definitely an extra challenge.
Zoe: Oh, interesting. I was actually feeling the opposite when I ended up trying to do a headstand again. You get people who think their headstands are a really big deal. When I was a child, everyone just did them all the time, and then as an adult, you’re like, “Oh my God, my head is going to collapse, what am I doing?”
Christie: You have much further to fall as well.
Zoe: That is it. Tell me a bit about the history of Continuous Delivery.
Christie: Sure, it’s interesting. It kind of goes back, in some ways further than I would have thought, but then at the same time, it’s also all relatively new. Continuous Delivery started with continuous integration, which people have probably heard a lot about. That, as an idea kind of came around in the 90s, in 1994.
But it wasn’t until about 1999 that it actually got a definition, which was part of the whole Extreme Programming movement. Then, continuous delivery and continuous deployment are pretty interesting because people confuse them a lot, which I think is actually because they came into existence around the same time. Continuous deployment was defined in 2009 and then, shortly after that, in 2010, continuous delivery and the book with the same name came out.
I actually have a theory that both of these terms were trying to label a very similar movement that was happening, but because continuous deployment used the word deployment, that phrase has ended up meaning a process where you are literally deploying continuously, as frequently as you possibly can. Usually, every commit is going straight out to production.
Continuous delivery, meanwhile, is a bit more of a vague term for a set of practices that may or may not include continuous deployment. There’s also the term CI/CD, which you probably hear a lot, and which I think is really interesting because when I tried to find out what that meant and where it came from, I don’t actually think anybody created that term intentionally.
I’m getting the impression that people were talking about continuous integration, then suddenly continuous deployment and continuous delivery, and then they needed some phrase that tried to refer to all of it, so they just started saying CI/CD. That came about around 2014 (which I guess was when I was starting to learn handstands).
Anyways, in some ways this goes back a long time, but at the same time, with ideas like continuous delivery, we’re talking about the past 10 or 11 years. It’s really pretty new in terms of how quickly things are adopted in software.
Zoe: Exactly, because there’s a difference between one person having an idea and going, “Oh, hey, wouldn’t it be neat if…” and it actually becoming widespread. The adoption of Agile is kind of similar: the Agile Manifesto was written way before most people were using it. I mean, it had to be.
Christie: Yes, actually I was just thinking of that, actually. I was talking to a friend recently and she was talking about Agile and she’s like, “That new thing that everyone’s doing: Agile,” and it’s like, well, actually it’s been I think around 20 years now, but it takes so long for people to actually adopt these things and start using them.
Zoe: When we use the term continuous integration, do people kind of mean the same thing as what we’re going to talk about now, when we talk about continuous delivery?
Christie: I’d say people use the terms pretty interchangeably. I spent quite a long time looking at the definition, so I have a maybe slightly too pedantic idea of what the different terms mean. In my mind, continuous integration is a set of activities that you do that’s part of continuous delivery. But continuous delivery, I think the name intentionally was trying to take continuous integration and then take it a little bit further through the process.
I think about continuous delivery as two things: One is all about having your code in a state where you can release at any time, safely.
The other part is about making that releasing or deploying as easy as possible. The first part about being in a releasable state to me, that’s kind of what continuous integration is all about. It’s about having changes that go into your software integrated continuously, making sure that they’re actually safe and successful. I’d say that continuous integration is kind of the first big chunk of what you need in order to be doing continuous delivery.
Zoe: There’s much more. Okay, why is this continuous delivery approach important?
Christie: It’s tough to answer that because in some ways, I’ve also been taking the view that we’re kind of all doing continuous delivery to some extent. Because there are many different practices that can fall underneath this, most software companies are doing at least some of them for sure. I’ve also decided to draw a line where, if you’re not using version control, I think you’re not doing continuous delivery, but as long as you’re using version control, I think you’re doing some form of continuous delivery. It’s kind of like yoga, which we were already talking about very briefly: I think yoga is a practice.
You are continually practicing it and getting better. You’re not really performing it, and I think it’s similar with continuous delivery: you’re not performing continuous delivery, you’re doing the practice, and you’re always improving. But I think what makes it really important is that when you do these practices, it really does make you faster. It makes the software you create more reliable, not to mention the psychological benefits. I think that just taking so much stress and toil out of making software makes the whole thing more pleasant and maintainable, and helps people not end up getting burnt out and wanting to leave the industry.
Zoe: I could not agree more. I’m having flashbacks to supporting a partner organization who did not get source control and insisted on editing their live site and then calling us up to say “It’s not working.” Kind of having to do this debugging on the live site to go, “No, wait. You’ve just put a loop in here. That’s why it’s not working. Like, what?” I totally relate to that and actually, reliability is about saving time in the long run. I think a lot of the cheats people are trying to do are like “How can we do something really quick now?” but when you’re making those kinds of quick, risky choices, overall, it takes you longer.
Christie: 100%. I think that that’s a big misconception about these practices as well. When you start looking at the overhead of getting these things up and running and then maintaining them, I think it can feel like it’s going to make you move more slowly. But the truth is, when you take shortcuts in order to get things out quickly at the beginning, it catches up with you eventually. You can go faster for a little while, but then you’ll hit some kind of limit where, suddenly, you can’t add new code to the existing codebase because it’s too complex. Or suddenly, something that worked well when you had maybe only four people working on the codebase doesn’t work when you’re trying to scale to 50.
Zoe: Yes, absolutely. Have you kind of come across any common issues that you can avoid or that would have been avoided if people had used continuous delivery?
Christie: I feel a lot of it comes down to when you’re not actively trying to improve these practices. I think you often end up in a state where you have to be very careful about what you’re doing. If your code is broken, or you don’t know whether it is because you haven’t been actively testing it, you have to be very cautious when you do a release.
That means you need people to be kind of focusing on that release, they need to be putting time aside, they need to be doing things like code freezes, which then kind of stall the rest of development. And when there are problems, it again comes down to people having to pay the price of working the long hours to sort through those to get the releases out, to monitor the releases and fix anything that comes out.
Additionally, by having to do that kind of slow down, it makes it harder and harder to add features and people become nervous too. You don’t want to touch that code that’s there because the last time somebody touched that there was a huge bug, so we’re just going to leave that alone because it’s too complicated. And people work in different branches, they hold their changes back and then they end up dumping these massive changes in all at the same time probably because a release is coming up, which just leads to very expensive, hard to fix bugs.
In the end, I think all of this becomes very expensive, especially in terms of people, which is interesting, I think, because software engineers are one of the most expensive resources that you can have and they’re the hardest to find.
It’s much easier to pay for some cloud compute time to run your automation than it is to find more software engineers, and I think it’s also just very frustrating for everyone involved. It makes your job less pleasant and kind of reduces the fun factor.
Zoe: Right, and also software engineers are, in some ways more temperamental than computers. Lots of people think of the hardware as being a bit temperamental, but actually if you stress out a software engineer, you’re not going to get the same results from them and particularly around creativity and imagination. Exactly those things we shut down when we’re under stress.
Christie: Exactly, and I think it also can lead to a culture with a lot of fear as well. Like you said, I think when you’re afraid, that’s not when you’re able to actually make creative choices and find innovative ways to move forward. That’s when you become very rigid and you end up stuck with the way you’ve been doing things all along, and everything can stagnate. One stressed-out engineer also ripples out to everyone around them in the organization. I think when you’re not focusing on these processes and maybe you’re instead just focusing on the next feature, things can become very toxic after a few years of that kind of approach.
Zoe: One of the things I love about your book, particularly when thinking about how large a topic this can seem, or how much work it can feel like to set something like this up, is how clearly you set everything out: you define everything and say, “This is what it is.” Can you tell me a bit about continuous delivery pipelines and what are the basic tasks that would go into a typical pipeline?
Christie: Regardless of what you’re doing, you’ll end up seeing variations on the same basic set of tasks. I would say there’s usually the CI side, and then there’s the rest of the process. On the CI side, there’s static analysis or linting of some kind, and then there’s testing, which would probably be broken down into faster tests and slower tests, with the faster tests run more often. As basic tasks in your pipeline, though, you’ll see tasks that run unit tests, integration tests, system tests. Those altogether are your CI building blocks.
Then on the other side, there’s usually a task that’s doing some building or publishing, which will depend on what you’re working on. I think the latest trend is building and publishing images, but that’s certainly not the case for everybody. Then there’s some task to do the actual deployment: something that automates what can often be a very manual process.
That’s the set of tasks that you’ll see, and then you can see all different combinations of them or sometimes you’ll have one large pipeline that’s doing all of those things. Sometimes you’ll see the CI part being done on its own. Then the parts where you’re doing the building, publishing and deploying, that might be a separate pipeline that you’re only invoking once in a while, but those are the basic building blocks.
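The building blocks Christie lists can be sketched in a few lines of Python. This is only an illustration, assuming each task is a function that reports success or failure; `run_pipeline`, `succeed` and `fail` are hypothetical names made up for this sketch and are not part of Tekton or any real CI system.

```python
from typing import Callable, List, Tuple

# A task is a name plus a function that returns True on success.
Task = Tuple[str, Callable[[], bool]]


def run_pipeline(tasks: List[Task]) -> List[str]:
    """Run tasks in order and stop at the first failure.

    Returns the names of the tasks that completed successfully.
    """
    completed = []
    for name, task in tasks:
        if not task():
            break
        completed.append(name)
    return completed


def succeed() -> bool:
    # Placeholder standing in for real work (assumption for this sketch).
    return True


def fail() -> bool:
    # Placeholder standing in for a task that breaks.
    return False


# The CI side: static analysis first, then tests from fastest to slowest.
CI_TASKS: List[Task] = [
    ("lint", succeed),
    ("unit tests", succeed),
    ("integration tests", succeed),
    ("system tests", succeed),
]

# The rest of the process: build, publish, deploy.
RELEASE_TASKS: List[Task] = [
    ("build", succeed),
    ("publish", succeed),
    ("deploy", succeed),
]
```

Calling `run_pipeline(CI_TASKS + RELEASE_TASKS)` models one large pipeline, while calling `run_pipeline(CI_TASKS)` on every change and `run_pipeline(RELEASE_TASKS)` only occasionally models the split that Christie describes.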
Zoe: Definitely something I’ve seen is that everything feels a lot easier and more approachable when you’re building a new system and everything’s green. They’re like, “Okay, we’re going to have these amazing tests and these amazing processes.” Imagine the situation where you’re upgrading your processes on a system that you’ve already been supporting. How would you go about creating a new continuous delivery pipeline for a legacy system?
Christie: A few things come to mind. One thing I would try to keep in mind is that sometimes this can seem really daunting. You look at some company that’s just started with a totally greenfield project and they’re doing blue-green deployments automatically on every commit. You look at that and you think, “We’re never going to get there, that’s depressing.”
The thing is that really, you’re in a state where anything you do is going to improve where you currently are. I think there’s a lot you can do that can add value very quickly. But I think part of it is also realizing that you may never get to everything and it’s a legacy system and you don’t necessarily need to touch everything.
I think that an approach that I don’t hear mentioned enough is to isolate parts of the system and prioritize. I think you want to identify whatever’s the most painful thing and start there. There was a phrase in the 2010 book about continuous delivery that says something about “bringing the pain forward in order to reduce it.” I think that that’s a really important approach. It’s counterintuitive and tends to be the opposite of what people do when they see something that’s very painful or difficult in their processes.
I think they tend to put it off. They tend to be like, “Okay, we’ll do that. Releasing is awful. We’re going to do that once every six months.” I think it’s worth taking the opposite approach: looking at this legacy system, identifying the worst thing about delivering this software, and then focusing on whatever that is and making it happen sooner and more often.
I’d take a guess that it’s probably something about tests. I think that adding some more automation around the tests is usually worthwhile. Again, even something like dividing up which tests are the ones that run quickly versus the ones that are taking hours to run and then maybe running those really quick ones really frequently. Then at that point, you’re in a state where you’ve isolated the slower tests, you’ve got your faster tests, and you can work on building up more of those faster tests to cover whatever changes you’re currently making.
If you’re adding new features or you’re fixing bugs, you can focus there, and then you may never get to fixing those slower older tests, but you’ve actually significantly improved things with what you can do. I would advise definitely focusing on whatever is the most painful thing, and then trying to do that sooner.
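One way to picture the fast/slow split is the small Python sketch below. It assumes you already have a record of how long each test took on a previous run; the function name `split_by_duration`, the one-second threshold and the test names in the usage note are all hypothetical choices for illustration.

```python
from typing import Dict, List, Tuple


def split_by_duration(
    durations: Dict[str, float], threshold_seconds: float = 1.0
) -> Tuple[List[str], List[str]]:
    """Partition test names into (fast, slow) by their recorded run time.

    Tests at or under the threshold count as fast; everything else is slow.
    Both lists are sorted so the result is stable between runs.
    """
    fast = sorted(name for name, secs in durations.items() if secs <= threshold_seconds)
    slow = sorted(name for name, secs in durations.items() if secs > threshold_seconds)
    return fast, slow
```

For example, given `{"test_parse": 0.02, "test_full_import": 5400.0}`, the first list would hold `test_parse` and could run on every change, while `test_full_import` would land in the second list and run less often.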
Then, also not being too distracted by trying to get to whatever the latest technology is. I think that you have to set your expectations a little bit lower. I guess the last thing I would say is that if you do have some new project that you’re working on, that can be a great place to experiment with some of these new technologies. Maybe you have this massive legacy system, but you have some new service you need to write. Maybe that’s a place where you can experiment with trying out some of these new things, even if you can’t necessarily update the entire existing system.
Zoe: That’s super interesting because actually a lot of it boils down to facing up to reality. Actually, as human beings, we sometimes blind ourselves to what can be done by thinking about the ideal. Like you say, if you’re already not testing your code well and releasing in a safe way, every step you make is an improvement. I really love this idea of bringing the pain forward rather than building up technical debt, almost like you’re doing some technical investment that will pay off in the future.
I was also reminded of yoga and practice, that actually, this isn’t about being perfect. Although we’re aiming for green lights, it’s not actually a ‘you won, you didn’t win’. It’s like how do you keep improving and keep doing the right things so that eventually, you improve slowly?
Christie: Exactly. You might not be doing a handstand in the middle of the room, but for you, just maybe getting through one class or showing up even, that might be the big accomplishment. I think a lot of this comes down to people comparing what they’re doing to what other people are doing and having an idea that there is a perfect way to do things. I think the reality is that you need to do whatever works for your company, the people you are working with and the software you are creating, and that might look quite different from what some other company is doing.
Zoe: It’s reasonably likely that it will, because software is so complex, and actually, every time I discuss software, it’s like it’s such a fascinating field, because we’re trying to find general principles for something where it really does boil down to the details so often. You need to know all of that detail before you can make the best decision. You can’t have rules, but you can have principles, which I think is really just interesting.
Christie: Right. That’s one thing I found very interesting when I started looking into some of these definitions around continuous delivery. Another term that I saw it referred to as was a discipline. I thought that was a very interesting term because it can actually mean a couple of different things. You can think about it in terms of someone’s done something wrong and they’re being disciplined, or you can think about it in terms of something like a meditation practice, where you discipline yourself to keep showing up. Maybe one day you’re very distracted and you’re not doing it well, but you’re consistently applying this discipline of observing these things and trying to improve them, versus trying to get to any particular end state.
Zoe: Very cool. You’ve mentioned automated tests a few times, and even this simple concept of separating out slow tests and fast tests. I think that writing good tests is just something that’s super, super hard: writing them, then structuring them and making a plan for how they’re going to run. What’s your advice in that area?
Christie: I love tests. In the very first part of my career, I didn’t write or encounter any tests and I feel like everything was really hard. But then I was able to work in an environment where there were tests. I guess everyone has different approaches, but for me, it just gave me so much more confidence, because I suddenly had this way of making changes and then verifying them at the same time.
I guess a couple things come to mind: One is the classic test pyramid approach. People neglect this pretty often, but I really do believe that you’re best off to make most of your tests unit tests. Unit could mean a lot of things, but the general idea is that each test is trying to test one thing specifically in isolation from other things.
I think that people often undervalue those tests because you see lots of memes where it’s like, “unit tests pass,” but there are two doors that can’t open. You test the things in isolation, but then when you put them together there’s a problem. I think that’s really where the rest of the tests come in. I think the role of the rest of the tests is really to test the gaps between those unit tests.
In my experience, if you have pretty good unit test coverage, which is probably something like 70 to 80% at least, and then you have a sprinkling of integration and system tests over that, I think the coverage that you get is actually pretty phenomenal, because the reality is, you’re never going to cover everything. Even trying to cover absolutely every single case is just not worth the investment. Of course, that depends on what you’re doing.
If people’s lives are at stake with the software you’re making, it might be worth investing more in system tests, which are the only tests that are actually going to test the thing the way users do. For most of what we’re doing, you can achieve, I think, a pretty good balance with mostly unit tests and a sprinkling of the other tests.
The other thing that I think people do is for some reason, they treat test code as somehow less important than the rest of the code that they’re writing. I can understand why, because it’s not delivering the feature or the business value directly, but it is still code that you’re going to have to deal with and maintain over time.
I would recommend treating it the same way as you do the rest of your code, which means organizing it in a way where you have kind of low coupling, high cohesion. You’re really thinking through, like when you make functions that support your test code, you’re treating those like the same sort of libraries you would write for any other part of your software. You’re organizing them well, you’re naming things well.
Then also looking at that code in Code Review. I think that part gets neglected a lot because you’re reviewing a change, there’s a little change and then there’s this huge mountain of test code. I think that we tend to skip looking at the test code, but I really think it is worth investing in tests in the same way as you do for the rest of your software.
Zoe: I don’t know if I sound hopelessly naïve saying this, but something that always struck me about tests is it was always like, “But who tests the test?” I totally understand that I could make a mistake when I’m coding, but I can also make a mistake when I’m coding the test that tests my code to start with.
Christie: No, that’s 100% true. I think it’s a diminishing returns thing. Again, depending on what you’re doing, perhaps writing tests that test your test code could be worth it. You actually see a variation of that in approaches like fuzzing, or things where you intentionally introduce noise into your system so you can see how well your tests stack up. Like, do they actually catch these problems?
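The idea of deliberately introducing a problem to see whether the tests catch it can be sketched in Python as below. This is closer to what is usually called mutation testing than to fuzzing; that label, and every name in the example (`is_adult`, `is_adult_mutant`, `weak_suite`, `strong_suite`), is an assumption made for illustration rather than something from the interview.

```python
from typing import Callable


def is_adult(age: int) -> bool:
    """The real implementation under test."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """A deliberately broken copy: '>=' has been changed to '>'."""
    return age > 18


def weak_suite(impl: Callable[[int], bool]) -> bool:
    """Passes if impl handles two values far away from the boundary."""
    return impl(30) is True and impl(5) is False


def strong_suite(impl: Callable[[int], bool]) -> bool:
    """Adds checks right at the boundary, where the injected bug lives."""
    return weak_suite(impl) and impl(18) is True and impl(17) is False
```

Both suites pass against `is_adult`. Against `is_adult_mutant`, `weak_suite` still passes, which shows it would miss this bug, while `strong_suite` fails and so catches it.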
Another thing I would say is that I think that maybe testing the tests themselves is probably something I usually wouldn’t do, but I would test the code that supports the tests. If you’re writing tests for something and you find yourself writing functions and then eventually there’s enough functions that you have some kind of maybe library or package with even like three or four different functions that you’re calling in a lot of tests, I would probably write tests for those functions, just to make sure that they’re well supported.
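As a small illustration of testing the code that supports the tests, the Python sketch below uses the standard `unittest` module. The helper `make_user` is a hypothetical example of a function that many tests might share; it is not taken from Tekton or from the book.

```python
import unittest


def make_user(name="test-user", roles=None):
    """Hypothetical shared test helper: build a user dict with defaults."""
    return {
        "name": name,
        # Copy the list so tests cannot accidentally share state.
        "roles": list(roles) if roles is not None else ["viewer"],
    }


class MakeUserTest(unittest.TestCase):
    """Tests for the helper itself, not for production code."""

    def test_defaults(self):
        self.assertEqual(make_user(), {"name": "test-user", "roles": ["viewer"]})

    def test_roles_are_copied(self):
        roles = ["admin"]
        user = make_user(roles=roles)
        user["roles"].append("editor")
        # The caller's list must be untouched by changes to the user.
        self.assertEqual(roles, ["admin"])
```

Saved in a file, these tests can be run with `python -m unittest`. The point is only that a helper called from many tests gets the same treatment as any other small library.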
Again, the other thing with tests is that you’re just never going to get everything, even with the most well-thought-out tests. You’re probably only going to catch the problems you were actually able to think of. There’s always going to be something hiding somewhere. It’s just a matter of determining how worthwhile it is to try to root that out.
I think that’s where human testing comes in too, where people doing a QA role, trying to use your system and think of novel ways to find problems. That’s where you’re going to find the things you couldn’t anticipate with your automated tests. I think no matter what level of automation you have, there is still a role to have for a human who’s creative in ways that tests never can be trying to interact with your system.
Zoe: Right. Then perhaps bringing us right back round to the start. That’s something where you can put that within your continuous delivery process. You’re not integrating the person into your system in the same way, but actually it is part of this pipeline of how things get released.
Christie: Right. Again, I think it depends on what you’re doing. I managed to never mention the project that I work on, so I feel like I want to mention it at least briefly. I work on an open source project called Tekton, a continuous delivery system built on Kubernetes. We rely almost entirely on automated tests for our software.
At the same time, there are things that we never catch and they usually end up coming down to users because the users are kind of doing the QA in that process. Like you said, depending on what you’re making, I think it can make a lot of sense to have some kind of QA phase in your process. I guess what that often ends up looking like is you can have pipelines that have explicit manual approval steps, maybe where it needs a person to go and verify something. I think another way that you can do it is by doing it in parallel as well.
Because again, if you have really good test automation and really good coverage, the kinds of things that you’ll have people finding are going to be very novel things. If you have people following through a script of they’re trying this and then this and then this every single time, I think finding a way to automate that is a good idea and worth the investment. But what you really want is people experimenting and trying totally new things. You could also have that going on in parallel with the rest of your efforts.
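A pipeline with an explicit manual approval step might look roughly like the Python sketch below. It assumes the approval decision is supplied as a function, so a real system could back it with a person clicking a button; `release`, `approved` and `gate_before` are hypothetical names for this illustration and do not describe how Tekton implements approvals.

```python
from typing import Callable, List


def release(
    steps: List[str],
    approved: Callable[[str], bool],
    gate_before: str = "deploy",
) -> List[str]:
    """Run steps in order, pausing for approval before the gated step.

    Returns the steps that were executed. If approval is refused, the
    gated step and everything after it are skipped.
    """
    executed = []
    for step in steps:
        if step == gate_before and not approved(step):
            break
        executed.append(step)
    return executed
```

With `approved=lambda step: True` every step runs; with `approved=lambda step: False` the pipeline stops after publishing and nothing is deployed until a person signs off.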
Zoe: Absolutely, use the human beings for the things that human beings are good at, which is doing crazy stuff that no one could ever have thought of.
Christie: Right, exactly.
Zoe: That’s been absolutely incredible. Christie, thank you so much. I hope that gives people an overview of why this is so important, how it can be useful and how to get started. Obviously, to find out more, you should read Christie’s book, which is amazing. You can find the link in the text that goes with this podcast. Thank you so much, Christie.
Christie: Thanks Zoe. It was really fun.