When you write code, you are not writing instructions for a computer. You are writing a specification for a compiler, and the compiler will produce whatever machine code it believes satisfies that specification, regardless of what you intended. The gap between those two things is where bugs live, where security vulnerabilities hide, and where performance assumptions quietly crumble.
This is not a complaint about compilers. It is an argument that most developers misunderstand the contract they have signed with the tools they use every day.
The Optimizer Removes Code You Were Counting On
Consider a function that zeros out a buffer containing a password before freeing memory. You write the loop. You compile it. The compiler notices that the buffer is never read after the zeroing loop runs, concludes the loop has no observable effect on the program’s output, and removes it entirely. This is not a bug in the compiler. It is a correct application of the “as-if” rule: the compiler is permitted to transform code in any way that produces the same observable behavior. The problem is that “observable behavior” is defined from the program’s perspective, not an attacker’s.
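Here is a minimal sketch of the pattern, with hypothetical names; under optimization, a compiler applying dead store elimination is entitled to drop the memset, because nothing reads the buffer between the zeroing and the free.

    #include <stdlib.h>
    #include <string.h>

    void handle_password(size_t len) {
        char *buf = malloc(len);
        if (buf == NULL)
            return;
        /* ... read the password into buf and use it ... */

        /* Intended cleanup: overwrite the secret before releasing the memory. */
        memset(buf, 0, len);  /* dead store: buf is never read again, so the
                                 optimizer may remove this call entirely */
        free(buf);
    }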
This specific issue, called dead store elimination, has caused real security vulnerabilities. The OpenSSL project has historically used explicit memset calls to clear sensitive memory, only to find compilers optimizing them away. Microsoft’s Security Development Lifecycle guidelines now include compiler flags specifically to disable this optimization for security-sensitive code. The compiler was not wrong. The developer’s mental model was.
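One widely used workaround, sketched below, routes the call through a volatile function pointer: the compiler must assume the pointer could point anywhere, so it cannot prove the call is a removable memset. This illustrates the idiom rather than any project's exact code; dedicated functions such as explicit_bzero (glibc and the BSDs) and memset_s (C11 Annex K, where implemented) exist for the same reason.

    #include <string.h>

    /* Because the function pointer is volatile, the compiler must emit a real
       call through it; it cannot assume the target is memset and therefore
       cannot apply dead store elimination to the zeroing. */
    static void *(*volatile memset_ptr)(void *, int, size_t) = memset;

    void secure_zero(void *p, size_t n) {
        memset_ptr(p, 0, n);
    }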
Undefined Behavior Gives the Compiler Permission to Do Anything
C and C++ have a category of operations called undefined behavior. Most developers interpret “undefined” to mean “unpredictable” or “implementation-specific.” The actual meaning is stronger: the standard places no requirements on a program that executes undefined behavior, so the compiler is free to assume it never happens and to reason about your entire program on that basis.
Signed integer overflow in C is undefined behavior. A compiler that sees code like if (x + 1 > x) and knows x is a signed integer will often compile this to true unconditionally, because the compiler assumes overflow cannot occur. The branch you wrote to catch an edge case simply vanishes. The Linux kernel maintainers deal with this class of problem regularly, which is why the kernel is compiled with -fno-strict-overflow.
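As a concrete sketch (hypothetical function names), the first version below contains the overflow check that the optimizer is allowed to delete; the second avoids the undefined operation, so the check survives. Compiling the first with -fsanitize=undefined will report the overflow at runtime.

    #include <limits.h>

    /* Broken: when x == INT_MAX, x + 1 overflows a signed int, which is
       undefined behavior. Assuming overflow never happens, the compiler may
       fold (x + 1 > x) to true and delete the fallback branch. */
    int increment_checked_broken(int x) {
        if (x + 1 > x)
            return x + 1;
        return -1;  /* the edge-case branch that can silently vanish */
    }

    /* Safe: compare against INT_MAX before doing the arithmetic, so no
       overflow ever occurs and the check cannot be optimized away. */
    int increment_checked(int x) {
        if (x < INT_MAX)
            return x + 1;
        return -1;
    }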
The deeper issue is that undefined behavior is not localized. It can cause a compiler to make inferences that change code far from the original problematic line. Linus Torvalds has bluntly described this as compilers “actively trying to remove security checks,” even if that characterization is technically unfair to the compiler’s logic.
Memory Model Gaps Break Correct-Looking Concurrent Code
Multi-threaded code is the clearest example of the gap between what you write and what executes. Modern CPUs reorder memory operations for performance. Compilers do the same. The C11 and C++11 memory models gave developers a formal way to reason about this, but most developers still write concurrent code as if memory operations happen in source order.
The result is that two threads accessing shared data without proper synchronization primitives may observe memory writes in different orders, and the compiler has no obligation to tell you this is happening. The data race is undefined behavior, which again hands the compiler permission to reason about your program in ways that break your assumptions.
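For a sketch of what proper synchronization buys you, in C11 terms (hypothetical publish/consume functions, one writer thread and one reader thread assumed): the release store and acquire load below pair up to guarantee that once the reader sees the flag, it also sees the data, a guarantee that plain variables never provide.

    #include <stdatomic.h>
    #include <stdbool.h>

    static int data;
    static atomic_bool ready = false;

    /* Writer thread: the release store orders the write to data before the
       write to the flag, from the reader's point of view. */
    void publish(int value) {
        data = value;
        atomic_store_explicit(&ready, true, memory_order_release);
    }

    /* Reader thread: the acquire load pairs with the release store, so once
       ready reads true, the write to data is guaranteed to be visible. With
       plain variables this would be a data race, and both the compiler and
       the CPU would be free to reorder the accesses. */
    int consume(void) {
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;  /* spin until the writer publishes */
        return data;
    }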
Java’s memory model is better specified, but the history of volatile in Java is instructive: for years, developers carried their intuitions about the keyword over from C, without understanding that Java’s volatile guarantees visibility of individual reads and writes but offers no atomicity for compound operations such as incrementing a counter. The word looked familiar. The semantics were different.
The Counterargument
The standard defense of aggressive compiler optimization is that it produces faster, smaller programs that run correctly on the vast majority of hardware. This is true. Compilers like LLVM and GCC produce code that, in practice, outperforms what most developers could write by hand. The optimizations that cause problems are also the optimizations that made modern software fast enough to be useful at scale.
There is also a reasonable argument that the problem is education, not compilers. If developers understood the as-if rule, undefined behavior, and memory models, they would write code that does not rely on the assumptions that get violated. The tools are documented. The standards exist.
This argument is correct but insufficient. The asymmetry between what code looks like and what it does is not obvious. Seeing it requires expertise that is rarely taught and is not surfaced by most tooling, and the gap becomes more dangerous as codebases grow and the distance widens between the person who wrote the code and the person maintaining it. Expecting every developer to hold the C++ memory model in their head while debugging a concurrency issue is a systems design failure, not an individual skill gap. Tools like AddressSanitizer, UBSan, and ThreadSanitizer exist precisely because the normal compilation path cannot be trusted to surface these issues.
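For reference, the sanitizers are enabled with build flags that both GCC and Clang accept; the instrumented binary then reports violations as the program runs, which is exactly what the normal compilation path will not do. They are typically used one at a time.

    cc -g -fsanitize=address   -o app main.c   # out-of-bounds and use-after-free
    cc -g -fsanitize=undefined -o app main.c   # signed overflow, invalid shifts, other UB
    cc -g -fsanitize=thread    -o app main.c   # data races between threads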
The Program You Shipped Is Not the Program You Wrote
This matters practically. Security-sensitive code needs compiler flags that disable specific optimizations. Concurrent code needs explicit synchronization and memory barriers, not just intuition. Performance benchmarks need to account for the fact that the optimizer may treat benchmark code differently from production code, for example by discarding work whose result is never used.
The compiler is a translator, and like any translator, it is not neutral. It makes choices, and those choices are governed by rules you agreed to implicitly when you picked a language. Understanding those rules is not optional for anyone writing code that matters. The contract was always there. Most developers just never read it.