The performance conversation in software engineering has a peculiar blind spot. We reach for profilers, debate O(n log n) versus O(n²), and argue about whether to use a hash map or a sorted array. Meanwhile, the most powerful optimization available sits largely unexamined: deciding that certain code should never execute in the first place.
This isn’t about dead code elimination (the compiler does that). It’s about a discipline of thought that asks, before writing a single line, whether the computation is actually necessary. The fastest path through code is the one that skips the code entirely.
The Hidden Cost of Doing the Work
Every function call has a cost beyond its Big O complexity. There’s the cost of the work itself, the cost of the infrastructure supporting that work, the cost of debugging it when it fails, and the cost of maintaining it as requirements change. When you avoid running code, you avoid all of those costs simultaneously.
Consider a classic example: eager versus lazy evaluation. Many systems compute results upfront and cache them, on the assumption that the results will be needed. Sometimes that’s correct. But in systems where users request data they never actually view, you’ve paid the full computation cost for zero benefit. Lazy evaluation, where you compute only when the value is actually requested, looks slower in microbenchmarks but faster everywhere that matters.
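The lazy side of this trade-off is easy to sketch. Here is a minimal TypeScript illustration (the `lazy` helper and the preview computation are hypothetical, not from any particular library): the expensive work runs at most once, and only if someone actually asks.

```typescript
// Hypothetical helper: wrap an expensive computation so it runs
// on first access instead of upfront.
function lazy<T>(compute: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (!cached) cached = { value: compute() }; // compute at most once, on demand
    return cached.value;
  };
}

let calls = 0;
const preview = lazy(() => {
  calls++; // stands in for real rendering cost
  return "rendered preview";
});

// At this point nothing has run: if the user never asks, we never pay.
const shown = preview(); // the first request triggers the work
preview();               // subsequent requests reuse it; calls stays at 1
```

If the value is never requested, `compute` never executes, which is exactly the case the eager version cannot avoid paying for.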
The same principle applies at higher levels of abstraction. A feature that generates a weekly analytics report for every user, regardless of whether those users ever open the report, is burning resources on work that produces no value. The optimization isn’t to generate the report faster. It’s to generate it only when someone requests it, or better, only when there’s evidence someone will.
Short-Circuit Logic as a Design Philosophy
Most developers know short-circuit evaluation as a language feature. In an expression like isAuthenticated() && hasPermission(), the runtime skips hasPermission() if isAuthenticated() returns false. You probably use this without thinking about it.
What fewer developers do is apply this as a conscious architectural principle. The idea is to structure your systems so that the cheap checks happen first and gate the expensive ones. Authentication before authorization. Authorization before database access. Schema validation before any business logic. Cache lookup before network call.
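The ordering above can be made concrete as a guard chain. This is a hypothetical sketch: the request shape, the check functions, and the status strings are all stand-ins, with each gate cheaper than the one after it.

```typescript
type Req = { headers: Record<string, string>; body: unknown };

let dbQueries = 0;

// Hypothetical checks, ordered cheapest first so each one gates the next.
const isAuthenticated = (r: Req) => "authorization" in r.headers;           // header lookup: microseconds
const hasPermission   = (r: Req) => r.headers["authorization"] === "admin"; // stand-in for an authz check
const isValidShape    = (r: Req) => typeof r.body === "object" && r.body !== null;
const queryDatabase   = (r: Req) => { dbQueries++; return "200 result"; };  // the expensive step

function handle(req: Req): string {
  if (!isAuthenticated(req)) return "401"; // reject before any authorization work
  if (!hasPermission(req))   return "403"; // reject before validation
  if (!isValidShape(req))    return "400"; // reject before any SQL runs
  return queryDatabase(req);               // reached only when every cheap gate passes
}

const rejected = handle({ headers: {}, body: {} }); // fails at the first, cheapest gate
const served   = handle({ headers: { authorization: "admin" }, body: {} });
```

An invalid request never touches the database: the guard chain is short-circuit evaluation promoted from an expression into the shape of the handler itself.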
This sounds obvious when stated plainly. In practice, it requires discipline because the “natural” order of operations often follows the happy path rather than the cheap-first path. You think about what needs to happen when everything works, not about what needs to happen first to discover that you can skip the rest.
A well-designed API endpoint might reject ninety percent of invalid requests based on a header check that takes microseconds, before any SQL ever executes. That’s not just faster; it’s more secure, more resilient under load, and easier to reason about.
Memoization Versus Never Computing
Memoization is a genuine optimization: compute something once, cache the result, return the cached version on subsequent calls. It’s useful and worth knowing. But it’s a consolation prize compared to not needing the computation at all.
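For reference, the pattern itself is small. A minimal single-argument memoizer might look like this (a generic sketch, not tied to any library):

```typescript
// Minimal memoizer for a single-argument function.
function memoize<A, R>(fn: (a: A) => R): (a: A) => R {
  const cache = new Map<A, R>();
  return (a: A) => {
    if (!cache.has(a)) cache.set(a, fn(a)); // pay the cost once per distinct input
    return cache.get(a)!;
  };
}

let evaluations = 0;
const slowSquare = (n: number) => { evaluations++; return n * n; }; // imagine real expense here
const fastSquare = memoize(slowSquare);

fastSquare(7);
const result = fastSquare(7); // served from cache; slowSquare ran only once
```

Note what the cache does not eliminate: the first call still pays full price, the cache itself consumes memory, and invalidation becomes your problem.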
The pattern I see repeatedly in large codebases is that memoization gets applied to functions that shouldn’t exist in their current form. The underlying question, “what does this caller actually need?” was never asked. Instead, an expensive function was written, it showed up in profiling, and someone added a cache.
Before you memoize, ask whether the computation is actually necessary given what the caller does with the result. It’s surprisingly common to find that callers use only a small portion of an expensive result set, and a targeted query or a leaner interface eliminates the need to cache anything.
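A hypothetical before-and-after makes the point. Suppose profiling flags `fetchAllOrders` and the instinct is to memoize it, but the caller only ever wanted a count:

```typescript
// Hypothetical data store standing in for a database table.
const orders = [
  { id: 1, total: 40 },
  { id: 2, total: 75 },
  { id: 3, total: 120 },
];

// Expensive: materializes every order. A prime candidate for memoization,
// if you never ask what the caller does with the result.
function fetchAllOrders() {
  return orders.map(o => ({ ...o })); // imagine a heavy query plus serialization here
}

// The caller only wanted one number. A targeted query (think
// SELECT COUNT(*) ... WHERE total > ?) makes the expensive function,
// and any cache in front of it, unnecessary.
function countLargeOrders(threshold: number): number {
  return orders.filter(o => o.total > threshold).length;
}

const before = fetchAllOrders().filter(o => o.total > 50).length; // original call shape
const after  = countLargeOrders(50);                              // leaner interface, same answer
```

The second version has nothing to cache, nothing to invalidate, and nothing to show up in a profile.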
Avoiding Work at the System Level
The principle scales beyond individual functions. At the system level, avoiding unnecessary work looks like conditional deployment (run this service only in regions where it’s needed), feature flags that prevent code from executing for users who won’t encounter the relevant feature, and request coalescing (where multiple identical downstream requests get collapsed into one).
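Request coalescing in particular fits in a few lines. This is a hypothetical in-process sketch: concurrent identical requests piggyback on the one already in flight rather than each triggering its own downstream call.

```typescript
// Hypothetical in-flight request coalescing, keyed by user id.
const inFlight = new Map<string, Promise<string>>();
let downstreamCalls = 0;

function fetchProfile(userId: string): Promise<string> {
  const existing = inFlight.get(userId);
  if (existing) return existing; // piggyback on the request already running

  const request = (async () => {
    downstreamCalls++;            // stands in for the real network round trip
    return `profile:${userId}`;
  })().finally(() => inFlight.delete(userId)); // allow fresh fetches afterwards

  inFlight.set(userId, request);
  return request;
}

// Three concurrent identical requests collapse into a single downstream call.
const coalesced = Promise.all([
  fetchProfile("alice"),
  fetchProfile("alice"),
  fetchProfile("alice"),
]);
```

All three callers receive the same result, but the downstream system sees one request instead of three; the other two executions simply never happen.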
Content delivery networks are a good example of this thinking applied at scale. A CDN doesn’t make your origin server faster. It makes your origin server irrelevant for the majority of requests. The computation (serving the file) still happens, but it happens once, and every subsequent request skips it entirely.
This is also why deleting a feature is often the hardest call in engineering. The code that does the most to improve performance is frequently the code you remove, not the code you optimize. Fewer features mean fewer code paths, fewer edge cases to handle, fewer opportunities for something expensive to execute when it shouldn’t.
The Counterargument
The obvious pushback is that premature optimization is the root of all evil, to quote Knuth’s famous observation, and that avoiding work before you know what work is actually slow is just speculation dressed up as discipline.
This is a fair point badly applied. Knuth was warning against micro-optimizations that sacrifice clarity before profiling identifies a real bottleneck. He wasn’t arguing against thinking clearly about what your code needs to do before you write it. The two things are different.
Avoiding unnecessary computation isn’t premature optimization. It’s requirements clarity. If a piece of code will run in a context where its result will be discarded, that’s not a performance concern, it’s a design concern. Catching it before writing the code isn’t premature, it’s appropriate.
The discipline also doesn’t require predicting the future. It requires asking a simple question during design: “Under what conditions is this computation actually needed?” If the answer isn’t “always,” then the code should reflect that.
The Real Performance Question
Performance culture in software tends to reward sophistication: clever algorithms, tight loops, careful memory management. These things matter, and understanding them is part of being good at this work. But the mental model that produces the biggest gains is a simpler one.
Before you ask how to run code faster, ask whether you need to run it at all. The answer, more often than you’d expect, is that you don’t. And no amount of optimization beats zero work.