Why `+ 0.1f` Can Be Faster Than `+ 0` in C++: Compiler Optimization and Floating-Point Behavior
Question
In Visual Studio 2010 SP1, why does the following C++ code run more than 10 times faster when it uses `+ 0.1f` and `- 0.1f` than when it uses `+ 0` and `- 0`, even though the two versions look almost identical?
const float x[16] = {
    1.1f, 1.2f, 1.3f, 1.4f,
    1.5f, 1.6f, 1.7f, 1.8f,
    1.9f, 2.0f, 2.1f, 2.2f,
    2.3f, 2.4f, 2.5f, 2.6f
};
const float z[16] = {
    1.123f, 1.234f, 1.345f, 156.467f,
    1.578f, 1.689f, 1.790f, 1.812f,
    1.923f, 2.034f, 2.145f, 2.256f,
    2.367f, 2.478f, 2.589f, 2.690f
};
float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}
for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0.1f;
        y[i] = y[i] - 0.1f;
    }
}
But this version is much slower:
const float x[16] = {
    1.1f, 1.2f, 1.3f, 1.4f,
    1.5f, 1.6f, 1.7f, 1.8f,
    1.9f, 2.0f, 2.1f, 2.2f,
    2.3f, 2.4f, 2.5f, 2.6f
};
const float z[16] = {
    1.123f, 1.234f, 1.345f, 156.467f,
    1.578f, 1.689f, 1.790f, 1.812f,
    1.923f, 2.034f, 2.145f, 2.256f,
    2.367f, 2.478f, 2.589f, 2.690f
};
float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}
for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0;
        y[i] = y[i] - 0;
    }
}
The code was compiled with optimization level `/O2` and SSE2 enabled. The key question is: how can changing a value from `0.1f` to `0` make the program much slower instead of faster?
Short Answer
The short version: the difference comes from what the optimizer does with each version of the loop, not from the raw cost of the arithmetic itself. By the end of this page, you will understand how compilers remove useless work, how floating-point expressions can change optimization opportunities, why benchmarks can accidentally measure optimized-away code, and how to write more reliable performance tests in C++.
Concept
In C++, the code you write is not necessarily the code the CPU executes. An optimizing compiler analyzes your program and may:
- remove calculations that do not affect the final result
- rearrange operations
- keep values in registers instead of memory
- vectorize loops using SIMD instructions such as SSE2
- simplify expressions like `x + 0` into just `x`

That sounds like `+ 0` should always be faster. But in benchmarks, the opposite can happen if simplification changes the compiler's strategy.
The core idea: the compiler optimizes the whole expression
These two lines:
y[i] = y[i] + 0;
y[i] = y[i] - 0;
are mathematically redundant. A compiler can often remove them completely.
But once they disappear, the loop becomes:
y[i] *= x[i];
y[i] /= z[i];
Now the compiler may generate a very different machine-code version of the loop. In particular:
- it may stop using a vectorized path it used before
- it may keep a strict dependency chain from one iteration to the next
- it may expose the real cost of floating-point division
- it may decide the loop's result is never used in a meaningful way and optimize differently
With `0.1f`, the compiler cannot simply remove both operations in the same way, because floating-point arithmetic is not exact: under strict floating-point semantics, `(y[i] + 0.1f) - 0.1f` is not guaranteed to produce exactly `y[i]`, so both operations must be kept.
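A minimal pair makes this visible if you inspect the generated assembly (for example with `/FA` in MSVC, `-S` in GCC or Clang, or on Compiler Explorer); the function names here are just for illustration:

// Compile with optimizations and compare the assembly for these two.
// Depending on the compiler's floating-point mode, the first function
// can often be reduced to a plain copy, while under strict semantics
// the second must keep a real addition and subtraction.
float add_sub_zero(float v)
{
    v = v + 0.0f;
    v = v - 0.0f;
    return v;
}

float add_sub_tenth(float v)
{
    v = v + 0.1f;
    v = v - 0.1f;
    return v;
}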
Mental Model
Think of the compiler like a very aggressive assistant who tries to simplify your work before doing it.
- If you write “take a number, add zero, then subtract zero,” the assistant says: “That changes nothing. I’ll remove those steps.”
- If you write “add 0.1f, then subtract 0.1f,” the assistant says: “In floating-point math, those may not cancel exactly, so I should be careful.”
Now imagine two factory workflows:
- In one workflow, the assistant removes some stations, which accidentally causes the remaining process to be organized less efficiently.
- In the other workflow, the extra stations stay, but the whole line can be grouped and processed in batches.
Even though the second workflow looks like more work, the whole system can end up faster.
That is what often happens with optimized C++: a tiny source change can push the compiler into a completely different machine-code strategy.
Syntax and Examples
Redundant expressions in C++
A compiler can often simplify arithmetic identities:
float a = value + 0.0f; // often simplified to value
float b = value * 1.0f; // often simplified to value
But with floating-point code, not every expression is safely removable under strict rules.
Example 1: integer arithmetic
int x = 10;
x = x + 0;
x = x - 0;
For integers, this is always equivalent to:
int x = 10;
The compiler can remove the extra operations.
Example 2: floating-point arithmetic
float x = 1.2345f;
x = x + 0.1f;
x = x - 0.1f;
This looks like it should do nothing, but floating-point numbers are stored with limited precision, so the intermediate result may be rounded. That means the final value is not always bit-for-bit identical to the original.
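A small program shows the effect directly; the input `1e-7f` is just one value for which the round trip happens to lose precision (a sketch, assuming typical IEEE-754 `float`):

#include <cstdio>

int main()
{
    float x = 1e-7f;
    float r = (x + 0.1f) - 0.1f;

    // Print enough digits to reveal the difference.
    std::printf("x = %.9g\n", x);        // 1.00000001e-07
    std::printf("r = %.9g\n", r);        // 9.68575477e-08 on typical IEEE-754 systems
    std::printf("equal: %d\n", x == r);  // equal: 0
    return 0;
}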
Example 3: benchmarking trap
Step by Step Execution
Consider this smaller example:
float y = 2.0f;
float x = 3.0f;
float z = 4.0f;
y *= x;
y /= z;
y = y + 0;
y = y - 0;
Step-by-step at the source level
- `y` starts as `2.0f`
- `y *= x` makes `y = 6.0f`
- `y /= z` makes `y = 1.5f`
- `y = y + 0` keeps `y = 1.5f`
- `y = y - 0` keeps `y = 1.5f`
So logically, the last two lines do nothing.
What the compiler may do
The compiler may transform the whole sequence into:
float y = (2.0f * 3.0f) / 4.0f;
or even directly into a constant if everything is known in advance.
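For example, a function like the following (a hypothetical name, shown only to illustrate constant folding) may compile down to code that simply returns `1.5f`:

// With every input known at compile time, the optimizer may fold the
// whole computation into a single constant.
float folded()
{
    float y = 2.0f;
    y *= 3.0f;
    y /= 4.0f;
    y = y + 0;
    y = y - 0;
    return y;
}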
Real World Use Cases
This topic appears in real software whenever developers optimize numeric code or write benchmarks.
1. Game engines
Game loops perform many floating-point calculations for:
- physics
- animation
- particle systems
- camera movement
A small source-level change can affect SIMD optimization and frame time.
2. Image and signal processing
Operations on arrays of floats are common in:
- audio filters
- image transforms
- video processing
Compilers often try to vectorize these loops. Tiny changes can enable or block that optimization.
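For instance, a simple gain loop like this (a hypothetical audio example) is a typical vectorization candidate:

#include <cstddef>

// With optimizations and SSE2 enabled, a compiler may process several
// floats per instruction here instead of one at a time.
void apply_gain(float* samples, std::size_t count, float gain)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        samples[i] *= gain;
    }
}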
3. Scientific and financial software
Floating-point precision rules matter because mathematically identical expressions may not be computationally identical.
Examples:
- repeated rounding in simulations
- numerical stability in matrix code
- price calculations using the wrong numeric type
4. Benchmarking libraries and performance tests
Developers often create small benchmarks to compare algorithms. If the compiler removes or rewrites the work, the benchmark stops measuring what the developer intended.
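If you use the Google Benchmark library, for example, `benchmark::DoNotOptimize` forces the compiler to treat a value as used; a minimal sketch:

#include <benchmark/benchmark.h>

static void BM_Multiply(benchmark::State& state)
{
    float value = 1.0f;
    for (auto _ : state)
    {
        value *= 1.01f;
        benchmark::DoNotOptimize(value); // keeps the work observable
    }
}
BENCHMARK(BM_Multiply);
BENCHMARK_MAIN();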
5. Embedded and high-performance systems
When CPU time matters, teams inspect generated assembly or profiling results to verify whether the compiler produced:
- scalar code
- vectorized code
- fused operations
- redundant loads and stores
Real Codebase Usage
In real projects, developers rarely rely on guessing what is faster. Instead, they use patterns that make optimization safer and performance measurements more trustworthy.
Common patterns
1. Make benchmark results observable
If a computed value is never used, the compiler may remove the computation.
float result = compute();
std::cout << result << '\n';
Or store it somewhere visible to the program.
2. Separate algorithm code from benchmark code
A common structure is:
float run_algorithm(const float* x, const float* z, float* y)
{
    for (int i = 0; i < 16; ++i)
    {
        y[i] *= x[i];
        y[i] /= z[i];
    }

    float sum = 0.0f;
    for (int i = 0; i < 16; ++i)
    {
        sum += y[i];
    }
    return sum;
}
Returning a value makes the work harder to discard.
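A sketch of how `run_algorithm` might be timed with `std::chrono` (the setup values here are arbitrary):

#include <chrono>
#include <iostream>

float run_algorithm(const float* x, const float* z, float* y); // defined above

int main()
{
    float x[16], z[16], y[16];
    for (int i = 0; i < 16; ++i)
    {
        x[i] = 1.1f + 0.1f * i;
        z[i] = 1.1f + 0.11f * i;
        y[i] = x[i];
    }

    float checksum = 0.0f;
    auto start = std::chrono::steady_clock::now();
    for (int j = 0; j < 9000000; ++j)
    {
        checksum += run_algorithm(x, z, y);
    }
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::milli> elapsed = stop - start;
    std::cout << "checksum: " << checksum << '\n'; // keeps the work observable
    std::cout << "elapsed: " << elapsed.count() << " ms\n";
    return 0;
}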
3. Use profiling and assembly inspection
Developers often verify performance with:
- a profiler (for example, the Visual Studio profiler or Linux perf)
- the compiler's assembly output (`/FA` with MSVC, `-S` with GCC and Clang)
- tools such as Compiler Explorer that show the generated machine code
Common Mistakes
1. Benchmarking code whose result is never used
If the final value is ignored, the compiler may remove most or all of the loop.
Problematic example
float value = 1.0f;
for (int i = 0; i < 1000000; ++i)
{
    value *= 1.01f;
}
If `value` is never observed, this may not measure real work.
Better
float value = 1.0f;
for (int i = 0; i < 1000000; ++i)
{
    value *= 1.01f;
}
std::cout << value << '\n';
2. Assuming mathematically equivalent floating-point expressions are identical
This is false in many cases.
Example
float a = 0.3f;
float b = (a + 0.1f) - 0.1f;
b may not be bit-for-bit equal to a.
3. Assuming fewer source lines always means faster code
The compiler optimizes whole expressions. Removing one operation in source code can accidentally produce worse machine code.
Comparisons
Related ideas compared
| Concept | What it means | Effect on this example |
|---|---|---|
| `+ 0` / `- 0` | Identity operations | Usually removable by the compiler |
| `+ 0.1f` / `- 0.1f` | Not exact inverses in floating-point | Often must be preserved under strict rules |
| Integer arithmetic | Exact for these identities | Easier for compiler to simplify |
| Floating-point arithmetic | Rounded, limited precision | Harder to prove expressions are equivalent |
| Scalar code | One value at a time | May be slower for array-heavy loops |
| SIMD/vectorized code | Multiple values processed at once | Can make a huge speed difference |
Cheat Sheet
Quick reference
- C++ compilers optimize expressions, not just individual lines.
- `x + 0` and `x - 0` are usually removable.
- `(x + 0.1f) - 0.1f` is not guaranteed to equal `x` exactly in floating-point math.
- Floating-point arithmetic uses rounding, so algebraic identities may not hold exactly.
- Tiny source changes can trigger major changes in:
- vectorization
- register allocation
- instruction scheduling
- loop elimination
- Microbenchmarks are unreliable if the result is never used.
Safer benchmark checklist
// 1. Use Release mode
// 2. Ensure result is observable
// 3. Repeat enough times
// 4. Measure with a real timer/profiler
// 5. Inspect generated assembly if needed
Common identities the compiler may remove
- `x + 0`
- `x - 0`
- `x * 1`
- `x / 1`
But be careful with floating-point
These are not always safely interchangeable under strict semantics:
- `(x + a) - a`
- `(x * a) / a`
FAQ
Why would + 0 make code slower instead of faster?
Because + 0 may let the compiler generate a completely different optimized version of the loop. The slowdown usually comes from changed optimization strategy, not from the cost of adding zero.
Is + 0 always removed by the compiler in C++?
Usually yes, especially for simple expressions. But exact behavior depends on type, compiler, optimization level, and surrounding code.
Why don't `+ 0.1f` and `- 0.1f` always cancel out?
Because floating-point values are rounded to limited precision. After + 0.1f, the result may not be represented exactly, so subtracting 0.1f may not restore the original bit pattern.
Does this happen only in Visual Studio 2010?
No. The general idea applies to many compilers, though the exact performance difference depends on compiler version, optimization settings, and CPU architecture.
Is floating-point division expensive?
Yes. Division is usually much slower than addition or multiplication, and compiler choices around division-heavy loops can have a big impact on speed.
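As an illustration, a common manual optimization is to precompute reciprocals and multiply instead of dividing; a sketch (the function names are hypothetical, and results can differ in the last bit because `1.0f / z[i]` is itself rounded):

// Dividing on every iteration:
void divide_each(float (&y)[16], const float (&z)[16])
{
    for (int i = 0; i < 16; ++i)
        y[i] /= z[i];
}

// Multiplying by precomputed reciprocals is usually faster:
void multiply_by_reciprocal(float (&y)[16], const float (&z)[16])
{
    float inv_z[16];
    for (int i = 0; i < 16; ++i)
        inv_z[i] = 1.0f / z[i];
    for (int i = 0; i < 16; ++i)
        y[i] *= inv_z[i];
}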
How can I benchmark this kind of code correctly?
Use a release build, make sure the result is used, run enough iterations, and inspect generated assembly or use a profiler if the result looks suspicious.
Should I use volatile for benchmarking?
`volatile` can stop the compiler from removing a variable's reads and writes, so it is sometimes used to keep benchmark work alive. But it also blocks legitimate optimizations and can distort what you are measuring, so printing, returning, or summing the result is usually a better first choice.
Mini Project
Description
Build a small C++ benchmark that demonstrates how compiler optimizations can change the performance of a numeric loop. The project compares a loop that adds and subtracts zero with one that adds and subtracts 0.1f, then prints a checksum so the compiler cannot easily discard the work.
Goal
Create and run a benchmark that shows how small floating-point expression changes can affect optimization and execution speed.
Requirements
- Write a C++ program with two benchmark functions: one using `+ 0.0f` and `- 0.0f`, and one using `+ 0.1f` and `- 0.1f`
- Use arrays of `float` values and repeat the inner loop many times
- Return or print a checksum from each benchmark so the calculations remain observable
- Measure execution time using `std::chrono`
- Compile in Release mode with optimizations enabled
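One possible sketch, assuming a release build with optimizations enabled (exact timings will vary by compiler and CPU):

#include <chrono>
#include <iostream>

// Runs the multiply/divide loop plus an add/subtract adjustment, then
// returns a checksum so the compiler cannot discard the work.
template <typename AdjustFn>
float run_benchmark(AdjustFn adjust)
{
    const float x[16] = { 1.1f, 1.2f, 1.3f, 1.4f, 1.5f, 1.6f, 1.7f, 1.8f,
                          1.9f, 2.0f, 2.1f, 2.2f, 2.3f, 2.4f, 2.5f, 2.6f };
    const float z[16] = { 1.123f, 1.234f, 1.345f, 156.467f, 1.578f, 1.689f,
                          1.790f, 1.812f, 1.923f, 2.034f, 2.145f, 2.256f,
                          2.367f, 2.478f, 2.589f, 2.690f };
    float y[16];
    for (int i = 0; i < 16; ++i)
        y[i] = x[i];

    for (int j = 0; j < 9000000; ++j)
    {
        for (int i = 0; i < 16; ++i)
        {
            y[i] *= x[i];
            y[i] /= z[i];
            y[i] = adjust(y[i]);
        }
    }

    float checksum = 0.0f;
    for (int i = 0; i < 16; ++i)
        checksum += y[i];
    return checksum;
}

template <typename AdjustFn>
void time_benchmark(const char* label, AdjustFn adjust)
{
    auto start = std::chrono::steady_clock::now();
    float checksum = run_benchmark(adjust);
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::milli> ms = stop - start;
    std::cout << label << ": " << ms.count() << " ms"
              << " (checksum " << checksum << ")\n";
}

int main()
{
    time_benchmark("+0.0f / -0.0f", [](float v) { return (v + 0.0f) - 0.0f; });
    time_benchmark("+0.1f / -0.1f", [](float v) { return (v + 0.1f) - 0.1f; });
    return 0;
}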
Keep learning
Related questions
Basic Rules and Idioms for Operator Overloading in C++
Learn the core rules, syntax, and common idioms for operator overloading in C++, including member vs non-member operators.
C++ Casts Explained: C-Style Cast vs static_cast vs dynamic_cast
Learn the difference between C-style casts, static_cast, and dynamic_cast in C++ with clear examples, safety rules, and real usage tips.
C++ Lambda Expressions Explained: What They Are and When to Use Them
Learn what C++ lambda expressions are, why they exist, when to use them, and how they simplify callbacks, algorithms, and local logic.