Jeff Sussna is an internationally recognized systems thinker and IT expert, combining engineering expertise with an ability to bridge business, creative, and technical perspectives. He is Founder and Principal of Ingineering.IT, and will be speaking and leading a workshop at this year’s Managing Experience Conference in San Francisco, March 29–30. I sat down with Jeff to pick his brain on putting the ‘service design’ in software-as-a-service.
What intrigues you about the MX Conference? Why is this an event that you’re interested in speaking at and teaching at?
A couple of things intrigue me. One is that I specialize in helping IT organizations and digital businesses bring together agile, DevOps, and design thinking in order to adopt new methodologies and be able to deliver more continuous value, so I’m really interested in the relationship between design and engineering and design and IT in particular. The opportunity to cross-pollinate ideas and engage with new kinds of people is really interesting to me.
The other thing is, I think a lot about the relationship between designing and operating systems and designing and operating organizations. I’ve come to believe that they are more and more inseparable. Being able to mix in not only connecting with design, but connecting with design management and thinking about design organizations is such a bonus for me.
Very often, DevOps and design never speak. Given your background in IT and DevOps, how did you come to realize that design was a subject and a function that was meaningful to you? How did you make the connection that not only do these teams need to be talking to each other, these activities should be integrated?
Well, first, I actually have a liberal arts background. I grew up looking at pictures of Frank Lloyd Wright buildings and trying to read my dad’s architecture theory books. I was a liberal arts student in college, so I’ve always had a design/art/humanities bent to the way I think about things. A few years ago, I picked up Tim Brown’s book, Change by Design, and I was introduced to design thinking, and I went “Ah! I’m not a designer in a traditional sense but this maps really well to how I think about things and how I think about IT problems–how do we come up with new systems, how do we improve organizations.”
There was a great post just today by someone from New Relic, which is an IT monitoring product company, and he was talking about how they learned that they had to use UX design and design thinking principles in order to do system architecture, which is something you would think would be as far from design as you possibly could get. But what we’ve all been realizing in the DevOps world is that we don’t just make systems, but the process of making systems is a human activity, and we need to apply user-centered design to that activity. For me, it was a combination of my natural tendency of how I think about things, and also getting formally introduced to design thinking and more specifically, service design, because everything in IT these days is about service. Once I got introduced, it became clear to me that we’re designing services just like anybody else.
There’s this notion of a service-oriented architecture, and the way that service tends to be used in technology realms is a lot around the ability to transmit data and APIs. Instead of large monolithic applications, you’ve got a lot of nimble things talking to each other. How does a service-oriented architecture technology approach connect to service design? Is there a grand unified theory that ties these two seemingly different types of “services” together?
This is exactly what my talk is going to be about. One thing that the service-oriented architecture folks in IT have figured out is that you cannot succeed at breaking your architecture into these nimble, loosely coupled pieces unless you do the same thing with your organization. This is where organizational design and system design really become inseparable. Secondly, once you do that on both levels you now have to think about: great, we have all of these little pieces, how do we put them back together so that what we end up with is still a coherent company or coherent service or coherent website? It makes the relationship between the customer journey and the service blueprint more interesting: it is less binary, less one-big-chunk-in-the-front, one-big-chunk-in-the-back, and it starts to become about lots of little pieces. And because you have lots of little pieces both in terms of teams and systems, you can be nimble, but you still have to figure out how to deliver a coherent end-to-end service experience.
So much of your work is oriented on DevOps and IT, but when you do engage with designers, what are key things that they simply don’t understand that you think they ought to know if they want to build systems that work?
I think designers need to understand that as we move into this cloud world where everything is a system and a distributed system, we encounter new kinds of problems that designers have to help people with. I’ll give you a perfect example, and I’ll pick on Apple for this: when I go to something like Keynote or Pages on my iPad, I’m working locally and my data is getting synchronized with the Cloud, and that synchronization process is invisible. The problem is that I need to understand it on some level, because I need to understand that it doesn’t always work, that sometimes it’s slow, that I might have a version on my iPad that doesn’t match the version on my laptop but it will five minutes from now, or maybe it won’t an hour from now because there’s a problem. And I can’t see any of that, and I can’t understand any of it, and it causes a lot of frustration and a lot of “Why isn’t this working? This is supposed to be magic and invisible!” The problem is that when you get into a much more complex world, which now is the world of the Cloud and will soon be the world of the Internet of Things, design has to help people navigate distributed systems, and that means that on some level designers need to understand what’s different about distributed systems and how to help people navigate them.
In the past, designers designed for the happy path, and, sure, made sure they understood the “error or edge cases.” Now, given the complexity, those error and edge cases are likely to increase, yet they’re unpredictable by their nature. Do we now have to come up with a thousand different wireframes to address all the different ways this might fail? How do you design in a way that is flexible enough given the uncertainty?
You do two things. First, you start with the assumption that failure will happen and you need to think about it as part of your design process. Second, you assume that there are failure cases that you can’t know about and that may not even exist yet. What some companies are doing is they’re actually starting to go out and look for them, and this is where the neatly named “Chaos Monkey” comes into play. This is something that Netflix built where they will actually go out into their production systems while people are watching movies and intentionally break things: they’ll shut stuff off. And the interesting thing is that they don’t consider it a success when they run the Chaos Monkey and nothing goes wrong. They consider it a success when something does go wrong, because it means they found something that’s out there but they didn’t know what it was, and now they know more.
Again, this comes back to the idea that design needs to become continuous: that we design for failure as best we can, and then we learn, and then we design for some more failure. And our systems get better and better as we go. There’s a certain humility that we have to have. It’s interesting if you look at design thinking—it’s a very wonderful approach but there’s a certain kind of industrial aspect to it with the notion that you can iterate your way to something that’s good and finished and who knows, maybe you’ll design a coffee mug that you can sell and people will buy for the next 40 years because it’s so wonderful. In the complex digital services world we have to start, to some degree, to give that up and assume that our solution will continuously fail, and we’ll have to continuously evolve it.
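As a rough illustration of the Chaos Monkey idea Jeff describes, here is a toy Python simulation. It is not Netflix's tool and does not touch any real system: the service names, the dependency map, and every function below are hypothetical, invented only to show the loop of "break something on purpose, see whether users notice, record what you learned."

```python
import random

# Hypothetical toy system (an assumption for illustration only).
# Each service lists what it needs, and which of those needs it can
# live without because it has a fallback.
SERVICES = {
    "web": {"needs": ["catalog", "recommendations"], "fallback": {"recommendations"}},
    "catalog": {"needs": ["database"], "fallback": set()},
    "recommendations": {"needs": [], "fallback": set()},
    "database": {"needs": [], "fallback": set()},
}


def is_healthy(name, down):
    """A service is healthy if it is up and each dependency is healthy or has a fallback."""
    if name in down:
        return False
    spec = SERVICES[name]
    return all(
        is_healthy(dep, down) or dep in spec["fallback"]
        for dep in spec["needs"]
    )


def run_experiment(rng, entry_point="web"):
    """Shut off one randomly chosen service and report whether users were unaffected."""
    victim = rng.choice([s for s in SERVICES if s != entry_point])
    survived = is_healthy(entry_point, down={victim})
    return victim, survived


def find_weaknesses(rounds=20, seed=0):
    """Repeat the experiment and collect every service whose loss reached users."""
    rng = random.Random(seed)
    findings = set()
    for _ in range(rounds):
        victim, survived = run_experiment(rng)
        if not survived:
            findings.add(victim)
    return findings
```

Calling `find_weaknesses()` returns the names of services whose failure reaches the user. In the spirit of the interview, a non-empty result is the useful outcome: each name is a failure case you now know about and can design for in the next iteration.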
One of the conversations I find myself having a lot is around product teams; there’s this idea of the tri-force: a product management lead, a design lead, and an engineering lead working together to define the problem and articulate what success looks like, and then they work with a larger team of practitioners to build the solution. DevOps and IT folks are often considered plumbers–once the tri-force has figured out what to do, IT and DevOps make it go.
I’m curious, given your orientation outside of that group, how do you feel this needs to evolve? Who needs to be part of that conversation without making it so there are so many voices that it’s hard to have a conversation? How do you make sure, internally, the conversation doesn’t get bogged down and is respectful of all different things that it takes to appropriately deliver a service these days?
A couple of things. One is, on some level I think we need to stop looking at Dev and Ops as just plumbing. We need to cross-pollinate them with design and project management and we need to do it further to the left of the process. In terms of how we do that without, I agree, seven meetings a day and 17 people all collaborating, I think the answer is to shorten the feedback loops. Let’s say we have the same current model where product and design go off in the corner and come up with some brilliant idea by themselves, and the impact of that brilliant idea is going to be a drastic degradation of website performance. If you find that out as part of your feedback process quickly and easily, it’s a lot less painful than if it’s part of a three-month process, and you build it and put it out and find out that there are all of these problems that you should have dealt with upfront.
Second, one of the things that DevOps is trying to get at is that we develop various kinds of metrics at various levels and we propagate them so that everyone can see them: “Oh, I made this design change and network performance went to shit. That might be a bad idea.” The thing about feedback is that if you don’t get information back, it’s not feedback. And we can build mechanisms where people can share the implications of what they’re doing so that we can start to see each other’s concerns and each other’s level of work without always necessarily sitting in a room together talking. And that happens much more quickly.