There is a lot of information about engineering productivity out there. No one says improving it is easy, and it can be downright difficult to turn the practices you hear about into plans you can put into action. What follows is an example of how we can create an actionable plan to increase our productivity.
Let’s define engineering productivity as how effectively your engineering team can get important and valuable work done.
- How do you determine important and valuable work? Through goals and objectives.
- How do you effectively get work done? By removing wasted time and effort from the delivery cycle.
Goals, Planning, and Prioritizing
If productivity is an organizational goal, you need to make sure people understand why and how it affects them. You need to communicate the message over and over in as many venues as possible. The more developers understand the goals and the direction, the more engaged they’ll be with the work.
Engineering teams should try to set ambitious goals, focused through the lens of the team’s mission statement. We also try to create measures for defining success. Yes, this is the OKR framework, but any goal-setting or strategy-planning approach can be used. We try to keep goals (objectives) and measures (key results) from becoming project to-do lists. Projects are tasks we can use to move the measures. Goals are bigger than projects.
Engineers need to clearly understand the importance of their work. Large backlogs of work create decision fatigue about what work to prioritize. Without planning and prioritizing, we can end up with teams that aren’t aligned — the opposite of productivity. Use your goals, even the high-level organizational goals, as a guide to prioritize work.
Removing Wasted Time and Effort
A great resource for exploring engineering performance is the book Accelerate. Based on years of research and collected data (State of DevOps reports), the book sets out to find a way to measure software delivery performance — and what drives it. Some important measures include:
- Lead Time: time it takes to go from a customer making a request to the request being satisfied.
- Deployment Frequency: used as a proxy for batch size, since it is easy to measure and typically has low variability. In other words, smaller batches correlate with higher deploy frequency and higher quality.
- Time to Restore: given software failures are expected, it makes more sense to measure how quickly teams recover from failure.
- Change Fail Percentage: a proxy measure for quality throughout the process.
Each of these measures could be a goal we want to focus on and improve. Each measure has an impact on our ability to deliver software faster with better quality. Let’s also call out that these measures are somewhat overlapping and interdependent.
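To make the four measures concrete, here is a minimal sketch of how they might be computed from a log of deployments. The `Deploy` record, its field names, and the `delivery_measures` function are all hypothetical, invented for illustration; neither book prescribes this data model. It also assumes time to restore is counted from the failing deployment, which is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    """Hypothetical record of one production deployment."""
    requested_at: datetime                  # when the customer request was made
    deployed_at: datetime                   # when the change reached production
    failed: bool = False                    # did this change cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def delivery_measures(deploys: list[Deploy], window_days: int) -> dict:
    """Summarize the four measures for deployments made in a window of days."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_times = [d.deployed_at - d.requested_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    # Assumption: restore time is measured from the failing deployment.
    restores = [d.restored_at - d.deployed_at for d in failures if d.restored_at]

    return {
        "median_lead_time": median(lead_times),
        "deploys_per_day": len(deploys) / window_days,
        "median_time_to_restore": median(restores) if restores else None,
        "change_fail_percentage": 100 * len(failures) / len(deploys),
    }
```

Medians are used rather than means so that one unusually slow request does not dominate the picture. That is a design choice for this sketch, not a rule.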
Creating an Action Plan
Picking a Goal
As an experiment, let’s take one, Lead Time, and see how we could brainstorm ways to improve it. In a different favorite book, The DevOps Handbook, we’re presented with ways to effect change in Lead Time. A short summary that does not do justice to the depth presented in the book:
- Reduce toil with automation
- Reduce number of hand-offs
- Find and remove non-value time
- Create fast and frequent feedback loops
Let’s think about what’s involved between filing a ticket to start work and delivering that work to the end user. Many different tasks and activities happen within this cycle. This becomes the scope we can work within. Some high-level things come to mind:
- Designing
- Coding
- Reviewing
- Testing
- Bug Fixing
- Ramping
- Monitoring
Picking Measures
We should be thinking about ways to measure success and failure of these activities. This should be independent of the work we intend to undertake. We can draw upon the pain and stumbles that have happened in the past. Finding good measurements can be a very hard process itself. Let’s not be unrealistic about our expectations on manual processes — we’re only human and people make mistakes. Think about ways to make it easy to succeed and hard to fail:
- Find more defects in pre-release than post-release: We’re always going to have bugs, but let’s try to find and fix more of them before releasing.
- Reduce the times a project gets bumped to next release: This happens a lot and for many different reasons. We should be better at hitting the desired timeline.
- Reduce the time it takes people to be exposed to a feature release: It can take days or weeks for people to “see” new features appear in the apps when ramping a feature flag. This also makes A/B testing painful.
- Reduce the times a feature flag is rolled back: Finding problems after we ramp a feature in production is costly, painful, and slows the release of the feature.
- Reduce time to detect and time to mitigate incidents: We’ll always have breaking incidents, but we need to minimize the disruptions to people using the product. Minutes, not days.
- Reduce amount of non-value time: It’s hard to say “code should be reviewed in X minutes”, or “bugs should be found in Y hours”, but it’s easier to identify dead-time in those activities.
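Two of these measures are simple enough to sketch in code: the share of defects found before release, and the dead time a ticket spends waiting between activities. This is only an illustration under assumptions. The function names, the state names, and the idea of a time-ordered list of (timestamp, state) events per ticket are hypothetical; a real ticket tracker will expose this data differently.

```python
from datetime import datetime, timedelta


def pre_release_share(pre_release_defects: int, post_release_defects: int) -> float:
    """Fraction of all known defects that were found before release."""
    total = pre_release_defects + post_release_defects
    return pre_release_defects / total if total else 0.0


def idle_time(events: list[tuple[datetime, str]], waiting_states: set[str]) -> timedelta:
    """Total time a ticket sat in states where nobody was working on it.

    `events` is a time-ordered list of (timestamp, state entered) pairs.
    Each state lasts until the next event's timestamp.
    """
    total = timedelta()
    for (start, state), (end, _next_state) in zip(events, events[1:]):
        if state in waiting_states:
            total += end - start
    return total


# Hypothetical history of a single ticket.
ticket = [
    (datetime(2024, 3, 4, 9), "coding"),
    (datetime(2024, 3, 4, 15), "awaiting_review"),
    (datetime(2024, 3, 5, 11), "in_review"),
    (datetime(2024, 3, 5, 12), "awaiting_qa"),
    (datetime(2024, 3, 6, 12), "in_qa"),
    (datetime(2024, 3, 6, 16), "done"),
]
waiting = idle_time(ticket, {"awaiting_review", "awaiting_qa"})
```

Nothing here says how long a review should take. It only adds up the gaps where the ticket was waiting on someone, which is the dead time we want to shrink.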
Brainstorming Projects
With our objective and measures sketched out, let’s think about the activities and tasks we want to change. Some are manual. Many involve multiple teams. There are a lot of hand-offs. Let’s create smaller affinity groups based on the tasks and activities, using the four approaches from The DevOps Handbook listed earlier as the framework.
Reduce toil with automation
- Fast and continuous integration/UI testing
- Canary monitoring and alerting
- Simple hands-off deployments
- Easy low risk feature ramping
Reduce number of hand-offs by keeping cross-functional teams informed and involved
- Spec and requirement generation
- Test plan generation and updates
- Pre-release testing setup
Find and remove non-value time, usually the gaps between stages
- Fast edit/build/test cycles for developers
- Timely code reviews
- All code merges ready for QA next day
- File new defect tickets ASAP
- Prioritized pre-release defect tickets
- Merging green code
Create fast and frequent feedback loops
- Timely code reviews
- Fast and continuous integration/UI testing
- All code merges ready for QA next day
- File new defect tickets ASAP
- Fast, short feature ramps
- Canary monitoring and alerting
This level of grouping is perfect to start brainstorming actual project ideas. We’ve started at an organization-level objective (Increase engineering productivity), focused on a contributing factor (Lead time), and created a nice list of projects that could be used to affect the factor. This is important — we’re not focused on a single large project! We have many potential small, diverse projects. This dramatically increases the probability that we will succeed, to some degree. A single project is an all-or-nothing situation and lowers your probability of success. Most projects fail to complete, for one reason or another.
We also see that some project ideas appear in more than one group, such as timely code reviews and canary monitoring and alerting. Work on those ideas creates impact in more than one way.
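The overlap can be found mechanically once the groups are written down. Here is a minimal sketch that uses the lists above as data; the `shared_ideas` function name is hypothetical, and matching is by exact text, so similar but differently worded ideas are treated as distinct.

```python
from collections import defaultdict

# The four groups and their project ideas, copied from the lists above.
groups = {
    "Reduce toil with automation": [
        "Fast and continuous integration/UI testing",
        "Canary monitoring and alerting",
        "Simple hands-off deployments",
        "Easy low risk feature ramping",
    ],
    "Reduce number of hand-offs": [
        "Spec and requirement generation",
        "Test plan generation and updates",
        "Pre-release testing setup",
    ],
    "Find and remove non-value time": [
        "Fast edit/build/test cycles for developers",
        "Timely code reviews",
        "All code merges ready for QA next day",
        "File new defect tickets ASAP",
        "Prioritized pre-release defect tickets",
        "Merging green code",
    ],
    "Create fast and frequent feedback loops": [
        "Timely code reviews",
        "Fast and continuous integration/UI testing",
        "All code merges ready for QA next day",
        "File new defect tickets ASAP",
        "Fast, short feature ramps",
        "Canary monitoring and alerting",
    ],
}


def shared_ideas(groups: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each idea that appears in more than one group to those groups."""
    seen = defaultdict(list)
    for group, ideas in groups.items():
        for idea in ideas:
            seen[idea].append(group)
    return {idea: names for idea, names in seen.items() if len(names) > 1}


shared = shared_ideas(groups)
```

Ideas that land in `shared` are reasonable candidates to start with, since each one moves more than one lever at once.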
If you take anything away from this post, I hope it’s that improving engineering productivity is an actionable goal. We can be systematic and measure results.
Accelerate and The DevOps Handbook cover a lot more than what I’ve presented here. The information on organizational culture and its effects on performance is also very enlightening. I’d recommend both books to anyone who wants to learn more about ways to improve engineering productivity.