Why `+ 0.1f` Can Be Faster Than `+ 0` in C++: Compiler Optimization and Floating-Point Behavior
Question
In Visual Studio 2010 SP1, why does the following C++ code run more than 10 times faster when it uses `+ 0.1f` and `- 0.1f` than when it uses `+ 0` and `- 0`, even though the two versions look almost identical?
const float x[16] = {
    1.1f, 1.2f, 1.3f, 1.4f,
    1.5f, 1.6f, 1.7f, 1.8f,
    1.9f, 2.0f, 2.1f, 2.2f,
    2.3f, 2.4f, 2.5f, 2.6f
};
const float z[16] = {
    1.123f, 1.234f, 1.345f, 156.467f,
    1.578f, 1.689f, 1.790f, 1.812f,
    1.923f, 2.034f, 2.145f, 2.256f,
    2.367f, 2.478f, 2.589f, 2.690f
};
float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}
for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0.1f;
        y[i] = y[i] - 0.1f;
    }
}
But this version is much slower:
const float x[16] = {
    1.1f, 1.2f, 1.3f, 1.4f,
    1.5f, 1.6f, 1.7f, 1.8f,
    1.9f, 2.0f, 2.1f, 2.2f,
    2.3f, 2.4f, 2.5f, 2.6f
};
const float z[16] = {
    1.123f, 1.234f, 1.345f, 156.467f,
    1.578f, 1.689f, 1.790f, 1.812f,
    1.923f, 2.034f, 2.145f, 2.256f,
    2.367f, 2.478f, 2.589f, 2.690f
};
float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}
for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0;
        y[i] = y[i] - 0;
    }
}
The code was compiled with optimization level `/O2` and SSE2 enabled. The key question is: how can changing a value from `0.1f` to `0` make the program much slower instead of faster?
Short Answer
The short version: the difference comes from what the optimizer does with each version of the loop, not from the raw cost of the arithmetic itself. By the end of this page, you will understand how compilers remove useless work, how floating-point expressions can change optimization opportunities, why benchmarks can accidentally measure optimized-away code, and how to write more reliable performance tests in C++.
Concept
In C++, the code you write is not necessarily the code the CPU executes. An optimizing compiler analyzes your program and may:
- remove calculations that do not affect the final result
- rearrange operations
- keep values in registers instead of memory
- vectorize loops using SIMD instructions such as SSE2
- simplify expressions like `x + 0` into just `x`

That sounds like `+ 0` should always be faster. But in benchmarks, the opposite can happen if simplification changes the compiler's strategy.
The core idea: the compiler optimizes the whole expression
These two lines:
y[i] = y[i] + 0;
y[i] = y[i] - 0;
are mathematically redundant. A compiler can often remove them completely.
But once they disappear, the loop becomes:
y[i] *= x[i];
y[i] /= z[i];
Now the compiler may generate a very different machine-code version of the loop. In particular:
- it may stop using a vectorized path it used before
- it may keep a strict dependency chain from one iteration to the next
- it may expose the real cost of floating-point division
- it may decide the loop's result is never used in a meaningful way and optimize differently
With `0.1f`, the compiler cannot simply remove both operations in the same way, because floating-point arithmetic is not exact: under strict floating-point semantics, `(y[i] + 0.1f) - 0.1f` is not guaranteed to produce exactly `y[i]`, so both operations must be kept.
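A minimal pair makes this visible if you inspect the generated assembly (for example with `/FA` in MSVC, `-S` in GCC or Clang, or on Compiler Explorer); the function names here are just for illustration:

// Compile with optimizations and compare the assembly for these two.
// Depending on the compiler's floating-point mode, the first function
// can often be reduced to a plain copy, while under strict semantics
// the second must keep a real addition and subtraction.
float add_sub_zero(float v)
{
    v = v + 0.0f;
    v = v - 0.0f;
    return v;
}

float add_sub_tenth(float v)
{
    v = v + 0.1f;
    v = v - 0.1f;
    return v;
}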
Mental Model
Think of the compiler like a very aggressive assistant who tries to simplify your work before doing it.
- If you write “take a number, add zero, then subtract zero,” the assistant says: “That changes nothing. I’ll remove those steps.”
- If you write “add 0.1f, then subtract 0.1f,” the assistant says: “In floating-point math, those may not cancel exactly, so I should be careful.”
Now imagine two factory workflows:
- In one workflow, the assistant removes some stations, which accidentally causes the remaining process to be organized less efficiently.
- In the other workflow, the extra stations stay, but the whole line can be grouped and processed in batches.
Even though the second workflow looks like more work, the whole system can end up faster.
That is what often happens with optimized C++: a tiny source change can push the compiler into a completely different machine-code strategy.
Syntax and Examples
Redundant expressions in C++
A compiler can often simplify arithmetic identities:
float a = value + 0.0f; // often simplified to value
float b = value * 1.0f; // often simplified to value
But with floating-point code, not every expression is safely removable under strict rules.
Example 1: integer arithmetic
int x = 10;
x = x + 0;
x = x - 0;
For integers, this is always equivalent to:
int x = 10;
The compiler can remove the extra operations.
Example 2: floating-point arithmetic
float x = 1.2345f;
x = x + 0.1f;
x = x - 0.1f;
This looks like it should do nothing, but floating-point numbers are stored with limited precision, so the intermediate result may be rounded. That means the final value is not always bit-for-bit identical to the original.
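A small program shows the effect directly; the input `1e-7f` is just one value for which the round trip happens to lose precision (a sketch, assuming typical IEEE-754 `float`):

#include <cstdio>

int main()
{
    float x = 1e-7f;
    float r = (x + 0.1f) - 0.1f;

    // Print enough digits to reveal the difference.
    std::printf("x = %.9g\n", x);        // 1.00000001e-07
    std::printf("r = %.9g\n", r);        // 9.68575477e-08 on typical IEEE-754 systems
    std::printf("equal: %d\n", x == r);  // equal: 0
    return 0;
}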
Example 3: benchmarking trap
Step by Step Execution
Consider this smaller example:
float y = 2.0f;
float x = 3.0f;
float z = 4.0f;
y *= x;
y /= z;
y = y + 0;
y = y - 0;
Step-by-step at the source level
- `y` starts as `2.0f`
- `y *= x` makes `y = 6.0f`
- `y /= z` makes `y = 1.5f`
- `y = y + 0` keeps `y = 1.5f`
- `y = y - 0` keeps `y = 1.5f`
So logically, the last two lines do nothing.
What the compiler may do
The compiler may transform the whole sequence into:
float y = (2.0f * 3.0f) / 4.0f;
or even directly into a constant if everything is known in advance.
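For example, a function like the following (a hypothetical name, shown only to illustrate constant folding) may compile down to code that simply returns `1.5f`:

// With every input known at compile time, the optimizer may fold the
// whole computation into a single constant.
float folded()
{
    float y = 2.0f;
    y *= 3.0f;
    y /= 4.0f;
    y = y + 0;
    y = y - 0;
    return y;
}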
Real World Use Cases
This topic appears in real software whenever developers optimize numeric code or write benchmarks.
1. Game engines
Game loops perform many floating-point calculations for:
- physics
- animation
- particle systems
- camera movement
A small source-level change can affect SIMD optimization and frame time.
2. Image and signal processing
Operations on arrays of floats are common in:
- audio filters
- image transforms
- video processing
Compilers often try to vectorize these loops. Tiny changes can enable or block that optimization.
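For instance, a simple gain loop like this (a hypothetical audio example) is a typical vectorization candidate:

#include <cstddef>

// With optimizations and SSE2 enabled, a compiler may process several
// floats per instruction here instead of one at a time.
void apply_gain(float* samples, std::size_t count, float gain)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        samples[i] *= gain;
    }
}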
3. Scientific and financial software
Floating-point precision rules matter because mathematically identical expressions may not be computationally identical.
Examples:
- repeated rounding in simulations
- numerical stability in matrix code
- price calculations using the wrong numeric type
4. Benchmarking libraries and performance tests
Developers often create small benchmarks to compare algorithms. If the compiler removes or rewrites the work, the benchmark stops measuring what the developer intended.
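If you use the Google Benchmark library, for example, `benchmark::DoNotOptimize` forces the compiler to treat a value as used; a minimal sketch:

#include <benchmark/benchmark.h>

static void BM_Multiply(benchmark::State& state)
{
    float value = 1.0f;
    for (auto _ : state)
    {
        value *= 1.01f;
        benchmark::DoNotOptimize(value); // keeps the work observable
    }
}
BENCHMARK(BM_Multiply);
BENCHMARK_MAIN();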
5. Embedded and high-performance systems
When CPU time matters, teams inspect generated assembly or profiling results to verify whether the compiler produced:
- scalar code
- vectorized code
- fused operations
- redundant loads and stores
Real Codebase Usage
In real projects, developers rarely rely on guessing what is faster. Instead, they use patterns that make optimization safer and performance measurements more trustworthy.
Common patterns
1. Make benchmark results observable
If a computed value is never used, the compiler may remove the computation.
float result = compute();
std::cout << result << '\n';
Or store it somewhere visible to the program.
2. Separate algorithm code from benchmark code
A common structure is:
float run_algorithm(const float* x, const float* z, float* y)
{
    for (int i = 0; i < 16; ++i)
    {
        y[i] *= x[i];
        y[i] /= z[i];
    }

    float sum = 0.0f;
    for (int i = 0; i < 16; ++i)
    {
        sum += y[i];
    }
    return sum;
}
Returning a value makes the work harder to discard.
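A sketch of how `run_algorithm` might be timed with `std::chrono` (the setup values here are arbitrary):

#include <chrono>
#include <iostream>

float run_algorithm(const float* x, const float* z, float* y); // defined above

int main()
{
    float x[16], z[16], y[16];
    for (int i = 0; i < 16; ++i)
    {
        x[i] = 1.1f + 0.1f * i;
        z[i] = 1.1f + 0.11f * i;
        y[i] = x[i];
    }

    float checksum = 0.0f;
    auto start = std::chrono::steady_clock::now();
    for (int j = 0; j < 9000000; ++j)
    {
        checksum += run_algorithm(x, z, y);
    }
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::milli> elapsed = stop - start;
    std::cout << "checksum: " << checksum << '\n'; // keeps the work observable
    std::cout << "elapsed: " << elapsed.count() << " ms\n";
    return 0;
}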
3. Use profiling and assembly inspection
Developers often verify performance with:
- a profiler (for example, the Visual Studio profiler or Linux perf)
- the compiler's assembly output (`/FA` with MSVC, `-S` with GCC and Clang)
- tools such as Compiler Explorer that show the generated machine code
Common Mistakes
1. Benchmarking code whose result is never used
If the final value is ignored, the compiler may remove most or all of the loop.
Problematic example
float value = 1.0f;
for (int i = 0; i < 1000000; ++i)
{
    value *= 1.01f;
}
If `value` is never observed, this may not measure real work.
Better
float value = 1.0f;
for (int i = 0; i < 1000000; ++i)
{
    value *= 1.01f;
}
std::cout << value << '\n';
2. Assuming mathematically equivalent floating-point expressions are identical
This is false in many cases.
Example
float a = 0.3f;
float b = (a + 0.1f) - 0.1f;
b may not be bit-for-bit equal to a.
3. Assuming fewer source lines always means faster code
The compiler optimizes whole expressions. Removing one operation in source code can accidentally produce worse machine code.
Comparisons
Related ideas compared
| Concept | What it means | Effect on this example |
|---|---|---|
| `+ 0` / `- 0` | Identity operations | Usually removable by the compiler |
| `+ 0.1f` / `- 0.1f` | Not exact inverses in floating-point | Often must be preserved under strict rules |
| Integer arithmetic | Exact for these identities | Easier for compiler to simplify |
| Floating-point arithmetic | Rounded, limited precision | Harder to prove expressions are equivalent |
| Scalar code | One value at a time | May be slower for array-heavy loops |
| SIMD/vectorized code | Multiple values processed at once | Can make a huge speed difference |
Cheat Sheet
Quick reference
- C++ compilers optimize expressions, not just individual lines.
- `x + 0` and `x - 0` are usually removable.
- `(x + 0.1f) - 0.1f` is not guaranteed to equal `x` exactly in floating-point math.
- Floating-point arithmetic uses rounding, so algebraic identities may not hold exactly.
- Tiny source changes can trigger major changes in:
- vectorization
- register allocation
- instruction scheduling
- loop elimination
- Microbenchmarks are unreliable if the result is never used.
Safer benchmark checklist
// 1. Use Release mode
// 2. Ensure result is observable
// 3. Repeat enough times
// 4. Measure with a real timer/profiler
// 5. Inspect generated assembly if needed
Common identities the compiler may remove
- `x + 0`
- `x - 0`
- `x * 1`
- `x / 1`
But be careful with floating-point
These are not always safely interchangeable under strict semantics:
- `(x + a) - a`
- `(x * a) / a`
FAQ
Why would + 0 make code slower instead of faster?
Because + 0 may let the compiler generate a completely different optimized version of the loop. The slowdown usually comes from changed optimization strategy, not from the cost of adding zero.
Is + 0 always removed by the compiler in C++?
Usually yes, especially for simple expressions. But exact behavior depends on type, compiler, optimization level, and surrounding code.
Why don't `+ 0.1f` and `- 0.1f` always cancel out?
Because floating-point values are rounded to limited precision. After + 0.1f, the result may not be represented exactly, so subtracting 0.1f may not restore the original bit pattern.
Does this happen only in Visual Studio 2010?
No. The general idea applies to many compilers, though the exact performance difference depends on compiler version, optimization settings, and CPU architecture.
Is floating-point division expensive?
Yes. Division is usually much slower than addition or multiplication, and compiler choices around division-heavy loops can have a big impact on speed.
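As an illustration, a common manual optimization is to precompute reciprocals and multiply instead of dividing; a sketch (the function names are hypothetical, and results can differ in the last bit because `1.0f / z[i]` is itself rounded):

// Dividing on every iteration:
void divide_each(float (&y)[16], const float (&z)[16])
{
    for (int i = 0; i < 16; ++i)
        y[i] /= z[i];
}

// Multiplying by precomputed reciprocals is usually faster:
void multiply_by_reciprocal(float (&y)[16], const float (&z)[16])
{
    float inv_z[16];
    for (int i = 0; i < 16; ++i)
        inv_z[i] = 1.0f / z[i];
    for (int i = 0; i < 16; ++i)
        y[i] *= inv_z[i];
}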
How can I benchmark this kind of code correctly?
Use a release build, make sure the result is used, run enough iterations, and inspect generated assembly or use a profiler if the result looks suspicious.
Should I use volatile for benchmarking?
`volatile` can stop the compiler from removing a variable's reads and writes, so it is sometimes used to keep benchmark work alive. But it also blocks legitimate optimizations and can distort what you are measuring, so printing, returning, or summing the result is usually a better first choice.
Mini Project
Description
Build a small C++ benchmark that demonstrates how compiler optimizations can change the performance of a numeric loop. The project compares a loop that adds and subtracts zero with one that adds and subtracts 0.1f, then prints a checksum so the compiler cannot easily discard the work.
Goal
Create and run a benchmark that shows how small floating-point expression changes can affect optimization and execution speed.
Requirements
- Write a C++ program with two benchmark functions: one using `+ 0.0f` and `- 0.0f`, and one using `+ 0.1f` and `- 0.1f`
- Use arrays of `float` values and repeat the inner loop many times
- Return or print a checksum from each benchmark so the calculations remain observable
- Measure execution time using `std::chrono`
- Compile in Release mode with optimizations enabled
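One possible sketch, assuming a release build with optimizations enabled (exact timings will vary by compiler and CPU):

#include <chrono>
#include <iostream>

// Runs the multiply/divide loop plus an add/subtract adjustment, then
// returns a checksum so the compiler cannot discard the work.
template <typename AdjustFn>
float run_benchmark(AdjustFn adjust)
{
    const float x[16] = { 1.1f, 1.2f, 1.3f, 1.4f, 1.5f, 1.6f, 1.7f, 1.8f,
                          1.9f, 2.0f, 2.1f, 2.2f, 2.3f, 2.4f, 2.5f, 2.6f };
    const float z[16] = { 1.123f, 1.234f, 1.345f, 156.467f, 1.578f, 1.689f,
                          1.790f, 1.812f, 1.923f, 2.034f, 2.145f, 2.256f,
                          2.367f, 2.478f, 2.589f, 2.690f };
    float y[16];
    for (int i = 0; i < 16; ++i)
        y[i] = x[i];

    for (int j = 0; j < 9000000; ++j)
    {
        for (int i = 0; i < 16; ++i)
        {
            y[i] *= x[i];
            y[i] /= z[i];
            y[i] = adjust(y[i]);
        }
    }

    float checksum = 0.0f;
    for (int i = 0; i < 16; ++i)
        checksum += y[i];
    return checksum;
}

template <typename AdjustFn>
void time_benchmark(const char* label, AdjustFn adjust)
{
    auto start = std::chrono::steady_clock::now();
    float checksum = run_benchmark(adjust);
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::milli> ms = stop - start;
    std::cout << label << ": " << ms.count() << " ms"
              << " (checksum " << checksum << ")\n";
}

int main()
{
    time_benchmark("+0.0f / -0.0f", [](float v) { return (v + 0.0f) - 0.0f; });
    time_benchmark("+0.1f / -0.1f", [](float v) { return (v + 0.1f) - 0.1f; });
    return 0;
}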
Keep learning
Related questions
Basic Rules and Idioms for Operator Overloading in C++
Learn the core rules, syntax, and common idioms for operator overloading in C++, including member vs non-member operators.
C++ Casts Explained: C-Style Cast vs static_cast vs dynamic_cast
Learn the difference between C-style casts, static_cast, and dynamic_cast in C++ with clear examples, safety rules, and real usage tips.
C++ Lambda Expressions Explained: What They Are and When to Use Them
Learn what C++ lambda expressions are, why they exist, when to use them, and how they simplify callbacks, algorithms, and local logic.