Execution Speed vs Tech Debt: a Balancing Act?
A guide to up-leveling technical decision making
(I’ve been wanting to restart this substack for a few months now, finally managed to defeat my writer’s block. Look forward to any and all feedback! — Krishna)
You just got to product-market fit, raised several rounds of institutional capital, and your customers are hungry for your product. Still, you're unable to move fast because your technical infrastructure is crumbling and shipping velocity has dropped off a cliff. Technical debt is piling up, and you have to take three steps back for one step forward. Sound familiar? This story is more common than you think. In recent weeks, I've had a lot of conversations with founders and senior engineering and product leaders, and one common theme I've noticed is the struggle to improve the quality of technical decision-making while maintaining execution velocity.
Why and when do you worry about it?
Growing pains are still painful. When your technical infrastructure can't keep up with your ambition for the company, it's often a sign of product-market fit, but it is still a formidable chasm to cross on the way to building an enduring company. It manifests as a slowdown in execution, lost customer trust due to defects, systems that don't scale, or teams whose productive output drops because they are playing whack-a-mole. In today's world, with AI systems changing at the speed of daily news, the ability to balance speed and quality is even more important.
"Tech Debt" is a dreaded word in every startup's lexicon. In the early meanderings toward product-market fit, speed is of the essence, but by the time a company hits the inflection point of PMF, this debt has piled up. All those intentional and unintentional decisions made in the interest of speed come back to bite us. The challenge I've seen companies face is that the organizational muscle to deal with this is not developed yet, and the technical talent either doesn't exist or needs upskilling and upleveling. Company leaders need to intentionally prioritize thoughtful technical design and decision-making once they see clear signs of "pull" from the market. "Go slow to go fast" is the pithy adage that comes to mind.
The journey to up-leveling
Oftentimes, the journey to discovering what works falls into a few categories.
Hygiene: The first thing to check is whether basic hygiene exists and whether any of it can be firmed up. Paying off technical debt is often a long road, and improving basic systems like code review practices, deployment machinery, and the observability stack is a quick intervention that helps extend the runway on this journey. Depending on your company, this may or may not be a solution, but low-hanging fruit is always worth the squeeze.
People: The second question is about people. In the early stages of any startup, people get trained to move fast. However, when you start finding things crumbling, it's often a good time to pause and ask who is the final arbiter for technical decisions, who is "holding the bar" for what gets shipped, and whether this expectation has been explicitly set with that person. Let's call this person the "Chief Architect".
It can be a senior IC or a people manager, but it's always good to be explicit. If this person doesn't exist, or the right leader is unable to make the transition, it's important to supplement the team.
As a company that's scaling, you want to avoid single points of failure (just like in your technical infrastructure), and it's important to think about building and developing a bench of architects and principal engineers who can hold this bar in their sub-areas as teams scale. Building the bench is a slow process, and it's a good idea to get started as soon as you think you are on the path to building an enduring company.
All of this should flow back into your reward and recognition processes, so that you are incentivizing the right behaviors.
Process: Once you have the right people in place, make sure there is a process to align on the right decisions. This is often achieved through an "Architecture Council" or "Design Review Process", but thinking through the details with your own business context in mind is important. A few examples that come to mind:
Who should be a part of the architecture council? You typically want representatives from various teams/organizations so people are thinking through end-to-end use cases. You should have a designated "bar-raiser" who doesn't need to play nice in all the reviews.
When should a decision be brought to the council? Gating every small decision can slow things down; in some teams (e.g., PLG growth teams) it's probably not even worth it. It's good to define the shape (e.g., complex, cross-team) of projects that you want brought to the council.
Are decisions gated by reviews? Think through whether the council has blocking/veto powers or only shares non-blocking feedback, and which applies under what circumstances. Such councils and processes can be a great way to develop engineering talent, so companies will often find value in non-blocking FYI-style reviews too.
What is the SLA of decision-making? I've seen enough horror stories of design review processes taking 50+ turns in large tech companies which saps energy. At the same time, you want to land the right decisions. Evaluate based on your planning and execution cadences.
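To make the questions above concrete, here is a minimal sketch of how a team might write down its own routing rule for design proposals. Everything in it is an assumption for illustration: the `Proposal` fields, the `ReviewMode` names, and the thresholds are hypothetical, and your own business context should decide the actual rules.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewMode(Enum):
    NONE = "no council review"
    FYI = "non-blocking FYI review"
    BLOCKING = "blocking council review"


@dataclass
class Proposal:
    title: str
    teams_involved: int       # hypothetical: teams whose systems must change
    is_complex: bool          # hypothetical: the author's own judgment call
    is_plg_experiment: bool   # hypothetical: fast-moving growth work


def review_mode(p: Proposal) -> ReviewMode:
    """Illustrative rule only: gate what is both complex and cross-team."""
    if p.is_plg_experiment:
        # Assumption: growth experiments are not worth gating.
        return ReviewMode.NONE
    if p.is_complex and p.teams_involved > 1:
        return ReviewMode.BLOCKING
    if p.is_complex or p.teams_involved > 1:
        # Still worth sharing, mostly as a way to develop talent.
        return ReviewMode.FYI
    return ReviewMode.NONE


billing = Proposal("Re-architect billing", teams_involved=3,
                   is_complex=True, is_plg_experiment=False)
print(review_mode(billing).value)
```

Writing the rule down, even this crudely, forces the conversation about which projects deserve the council's time and which should simply ship.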
Organizational Structure: If you have the right bench strength and processes in place but are still struggling to make progress, it's a good idea to look at your organizational design to see if it maps to what's most important to your business. People get lost in the here and now, and a jumbled web of goals and reactive work will end up ensuring that even the best and brightest can't make any progress. A much longer discussion for another time!
Measuring Progress
Now that we have the ingredients out of the way, organizational leaders and CEOs need to make sure they have the right mechanisms in place to measure progress. Unlike product goals, tech debt payoff can often be loosey-goosey, and you realize after a year that you are back at square one. I recommend a combination of the following, prioritized based on your own needs:
Operational metrics for the technical design review process, including volume, quality, and cycle time
Quality, scalability, and performance metrics that matter to your customers and are easy to measure, as well as time to market for new capabilities
Improvement in on-call burden and capacity dedicated to reactive fire-fighting as well as cohort metrics around defect rate for your releases
Scorecards that you can grok and that are important for the business, so you can ensure teams are making progress on the right line items
Large migration projects should be broken down with risk mitigation or adoption goals for each milestone that you can hold teams accountable for
Developer experience measured through a combination of hard and soft metrics
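As a rough sketch of what a couple of these measurements could look like in practice, the snippet below computes design review cycle time, the number of reviews that blew past an SLA, and a defect rate per release cohort. The record shapes, the sample numbers, and the 14-day SLA are all made up for illustration; real data would come from whatever review and release tooling you already use.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass
class DesignReview:
    submitted: date
    decided: date


def cycle_times(reviews):
    """Days from submission to decision, one entry per review."""
    return [(r.decided - r.submitted).days for r in reviews]


def sla_breaches(reviews, sla_days):
    """Count reviews that took longer than the agreed SLA."""
    return sum(1 for days in cycle_times(reviews) if days > sla_days)


def defect_rate(cohorts):
    """Defects per release for each cohort: {name: (releases, defects)}."""
    return {name: defects / releases
            for name, (releases, defects) in cohorts.items()}


# Hypothetical sample data.
reviews = [
    DesignReview(date(2024, 1, 1), date(2024, 1, 6)),
    DesignReview(date(2024, 1, 3), date(2024, 1, 15)),
    DesignReview(date(2024, 1, 10), date(2024, 2, 9)),
]
cohorts = {"2024-Q1": (20, 8), "2024-Q2": (25, 5)}

print("median cycle time (days):", median(cycle_times(reviews)))
print("SLA breaches:", sla_breaches(reviews, sla_days=14))
print("defects per release:", defect_rate(cohorts))
```

The point is less the arithmetic than the habit: a handful of numbers tracked every quarter makes it much harder to drift back to square one unnoticed.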
In Conclusion
Hopefully, this provides a set of playbooks and techniques to consider as you are scaling your engineering teams. As always, the right answers are often context-dependent. Feel free to shoot me a note if you want to talk about it.
A stitch in time saves nine: overlooking tech debt early because of the need to ship fast is the primary reason teams nearly drown in it later. The right team composition and the right coach or advisor are super important for founders to see the red flags and deal with them appropriately.