I’m a big fan of Adrian Cockcroft. He’s currently VP of Cloud Strategy at AWS, and previously worked at Netflix. He’s been giving a talk entitled Innovation at Speed that I’ve written about in the past. His key idea is the Time to Value metric. His thesis is: the one metric to optimize for in an innovative company is the time it takes to go from having an idea to having that idea in production, providing value to customers.
That’s it. Everything else will follow.
The infrastructure and mindset you need in place to achieve a short time to value involve a few things:
You need to have continuous integration and deployment in place: that is, it needs to be as friction-free as possible for an engineer to write some code and have it automatically tested and pushed into production.
Engineers need to be comfortable with constantly releasing their code, likely using something along the lines of trunk-based development: essentially all code lives on a single branch (rather than on long-lived feature branches) into which everybody merges daily or every few days, and which is constantly released to some environment that people care about.
You need targeted release infrastructure through feature flags, allowing you to tightly control who gets exposed to changes (e.g. new features) and when.
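To make the feature-flag idea concrete, here is a minimal sketch of a flag check with per-user targeting and a deterministic percentage rollout. The flag name, configuration shape, and helper function are hypothetical illustrations, not Mattermost’s actual flag system:

```python
# Illustrative feature-flag check: explicit allow-list plus a deterministic
# percentage rollout. Names and structure are invented for this sketch.
import hashlib

FLAGS = {
    "collapsed_reply_threads": {
        "enabled_users": {"dev-team-alice", "dev-team-bob"},  # explicit allow-list
        "rollout_percent": 10,  # share of all other users exposed
    },
}

def is_enabled(flag_name: str, user_id: str) -> bool:
    """Return True if user_id should see the feature behind flag_name."""
    flag = FLAGS.get(flag_name)
    if flag is None:
        return False  # unknown flags default to off
    if user_id in flag["enabled_users"]:
        return True
    # Deterministic bucketing: hashing the (flag, user) pair keeps a given
    # user consistently in or out of the rollout across requests.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < flag["rollout_percent"]
```

The point of the deterministic hash is that widening `rollout_percent` only ever adds users; nobody flips back and forth between seeing and not seeing the feature.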
All of this makes it easy to get new ideas, features, concepts pushed to production fast.
However, it in no way guarantees that what you push live is actually good. The assumption is that you’ll be able to iterate quickly enough to adjust anything that doesn’t work as you go.
This is an assumption that’s worth looking at in a bit more depth, especially in the scenario of changing fundamental features of your product.
In my team at Mattermost, we are working on collapsed reply threads. This feature will significantly change the product, and how people use it, so we need to make sure we nail it.
Then the question becomes: how do we do both — how do we both deliver value fast, but also make sure we nail it?
Let’s start with two extreme approaches.
The first extreme is how software development was done last century: the waterfall model, where every project starts with a lot of planning, design, and documentation. The assumption is: once we nail the spec, we’ll be able to do a single implementation iteration, a single testing cycle, and then deliver (in a year, two, or never). So the investment is upfront: try to figure out all the flows and edge cases first.
That didn’t work so well. So Agile came along, where each design, implement, test, release cycle was condensed into a two-week sprint, and feedback on the release fed back into the design phase of the next cycle.
The second extreme is Adrian’s ultimate time-to-value scenario. In this approach, the entire “Agile cycle” is just a couple of hours. An engineer wakes up with “hey, I have an idea” and gets to “it’s live!” that same afternoon. (And potentially “deleted it, because it was silly” that same evening.)
That works for small ideas and tweaks, but features like collapsed reply threads are likely too complex for that. They require some significant thinking, UX design, implementation effort on various parts of the system (back-end, frontends, infrastructure) to provide any value (and not destroy it). So, we need to find a middle route.
This week I was scanning through some episodes of the Rework podcast, and came across one about how the Basecamp people designed “Hey” (their email service). Hey aims to take an innovative approach to email, so it handles various features differently than other email services. During the episode, they describe how they started the project with months of bouncing mock-ups and prototypes back and forth to figure out how the service would work. That approach makes sense to me, but is it sufficient to avoid significant rework later? And at what stage do you say “ok, our level of confidence is high enough, let’s build this thing?”
Here’s how I look at the stages of validating whether you “nailed” a significant feature:
Mockups: using mockups (interactive or not), you try to get a feel for whether a feature makes sense or not. Of course, you don’t have real data in a mockup, and you will not detect many edge cases.
Implementation: you start implementing the feature for real. You only run it on development and test environments. Now, more edge cases start to appear and more questions arise. What if this event occurs while I’m on this screen, how would that affect the other thing? You adjust and iterate.
Production use: this is the first time you use the feature in an actual production(-like) environment with actual data. You now switch from “I’m going to test specific scenarios” mode to “I’m using this in my day-to-day and will find out if we nailed it.”
At each stage you have to balance effort and risk. Hypothetically (and I say purely hypothetically), it’s possible to fully iron everything out at the mockup stage. Again: hello waterfall. This will take a prohibitively long time. And, spoiler alert: you’ll get it wrong anyway (again: hello waterfall). Therefore, it makes more sense to get a general feel for the feature in the mockup stage, to figure out the flows, but accept that many edge cases will not yet appear.
Actually implementing a feature is expensive, so we need to have some level of confidence this is going to be worth it. As engineers write the code, new questions will arise. What about this scenario, what about that one? Discussions take place, designs are adjusted, specs updated.
The production stage sounds scarier than it needs to be, and can actually be split into multiple sub-stages, through the magic of feature flags.
Narrowly targeted testing: in Mattermost’s case, we all use a daily build of the product internally (called “community daily”), and we could enable the feature for the development team there. Gathering feedback at this stage is easy: the team that works on the feature can judge if it works as expected and adjust accordingly. If it doesn’t, only the team will suffer. The iteration cycle can be as short as one day (make change, PR, merge, live the next day).
Broader limited testing: for instance by enabling the feature for all of “community daily.” Gathering feedback here probably needs a bit of infrastructure: an easy way for internal users to report issues (JIRA, a channel).
Cloud: enable the feature on the cloud editions (or at least a subset of instances). At this stage gathering feedback needs to switch to a more behavioral model, that is: observing changes in user behavior based on tracking data. Do people find the feature? Do they get stuck? Do they get back to it? Do they use it?
General availability: enable for all, remove the feature flag.
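The staged rollout above can be sketched as a progression of widening audiences behind the same flag. The stage names mirror the list above; the audience-matching fields and logic are invented purely for illustration:

```python
# Hypothetical sketch of a staged rollout: each stage widens the audience
# behind a single feature flag until general availability.
STAGES = ["dev_team", "community_daily", "cloud", "ga"]

def audience(stage: str, user: dict) -> bool:
    """Decide whether a user matches a given stage's audience."""
    if stage == "dev_team":
        return user.get("team") == "feature-dev-team"
    if stage == "community_daily":
        return user.get("server") == "community-daily"
    if stage == "cloud":
        return user.get("edition") == "cloud"
    if stage == "ga":
        return True  # everyone; the flag can now be removed
    return False

def exposed(current_stage: str, user: dict) -> bool:
    """A user is exposed if they match the current stage or any earlier one."""
    idx = STAGES.index(current_stage)
    return any(audience(s, user) for s in STAGES[: idx + 1])
```

Moving a feature forward is then just advancing `current_stage`: everyone who already had the feature keeps it, and a new ring of users gets added.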
The art is figuring out in which phase to spend the time. Does it make sense to try to nail this in the UX mockup stage, or do we accept that we’ll only be able to understand whether this makes sense when we get to really use it with production data?
It’s a journey.
In other news, I have some audio-visual bonus content for you this week, as on Thursday I was a guest at STXNext’s livestream. The video is up on various channels, including YouTube. I enjoyed the interview, and it allowed me to cover a bunch of topics and “case studies” that I haven’t spoken about much publicly. It’s an hour-long session, so feel free to skip over things, but here are some of the topics we cover with timestamps:
(3:40) A TL;DW version of my career story (skip)
(7:10) Moving between engineering, management, senior management and back to more hands-on management
(10:20) My interest in reducing waste of engineering time
(11:45) 10x programmers
(16:10) The 100x engineer
(23:10) The backstory behind the “Tinder for Jobs” project we did at OLX Group, and how we applied Lean Startup principles through...
(29:20) Pop-up Driven Development
(40:50) How Lean Startup ideas can be applied in larger, established organizations for features
(43:40) Moving from feature checklists to business impact
This and more hand-wavy stuff, for your entertainment