User:Bjacob/ArithmeticTimingDifferences: Difference between revisions

Line 105: Line 105:
An example is a [http://people.mozilla.org/~bjacob/ta/results/intel-celeron-1.8GHz-family-15-model-2-stepping-7-toshiba-satellite/ Pentium-4-era Celeron] exhibiting NaN-related slow behavior with SSE2 instructions, specifically on floating-point multiplications where one of the operands is NaN. See the alert about float multiplication and about double multiplication in [http://people.mozilla.org/~bjacob/ta/results/intel-celeron-1.8GHz-family-15-model-2-stepping-7-toshiba-satellite/time-math-x86-32-sse2-DAZ-FTZ.txt this log].
An example is a [http://people.mozilla.org/~bjacob/ta/results/intel-celeron-1.8GHz-family-15-model-2-stepping-7-toshiba-satellite/ Pentium-4-era Celeron] exhibiting NaN-related slow behavior with SSE2 instructions, specifically on floating-point multiplications where one of the operands is NaN. See the alert about float multiplication and about double multiplication in [http://people.mozilla.org/~bjacob/ta/results/intel-celeron-1.8GHz-family-15-model-2-stepping-7-toshiba-satellite/time-math-x86-32-sse2-DAZ-FTZ.txt this log].


=== Some recent AMD CPUs have abnormally ''fast'' comparisons with NaN values, even with SSE2 instructions ===
=== Some Intel CPUs have abnormally slow comparisons with NaN and denormals, even with SSE2, DAZ and FTZ ===


This is another reason why the notion that NaN issues are all solved by switching to SSE2 is not quite true.
This is another reason why the notion that NaN issues are all solved by switching to SSE2 is not quite true.


An [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/ 8-core AMD FX processor] exhibits abnormally ''fast'' single-precision floating-point equality comparisons (ucomiss / ucomisd instructions) when one of the operands is NaN. See the alerts about float equality comparison in [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/time-math-x86-64-DAZ-FTZ.txt this log].
The SSE instructions being used there are ucomiss and ucomisd.


The SSE instructions being used there are ucomiss and ucomisd.
An [http://people.mozilla.org/~bjacob/ta/results/intel-core-2-2.66GHz-family-6-model-15-stepping-11-st3fan/ Intel Core 2] exhibits abnormally ''slow'' floating-point equality comparisons, both with floats and doubles, whenever NaN or denormals are involved, even with SSE2, DAZ and FTZ. See the alerts about float and double equality comparison in [http://people.mozilla.org/~bjacob/ta/results/intel-core-2-2.66GHz-family-6-model-15-stepping-11-st3fan/time-math-x86-64-DAZ-FTZ.txt this log].
 
Timing differences in equality comparisons are worrisome because equality comparison is what we would naturally use if we wanted to manually avoid NaN values. So having this not run in constant time means that the idea of manually avoiding specific "bad" values may be difficult.
 
=== Some recent AMD CPUs have abnormally ''fast'' comparisons with NaN, even with SSE2 instructions ===
 
This one is also about equality comparisons, but in a different way. An [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/ 8-core AMD FX processor] exhibits abnormally ''fast'' single-precision floating-point equality comparisons (ucomiss / ucomisd instructions) when one of the operands is NaN. See the alerts about float equality comparison in [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/time-math-x86-64-DAZ-FTZ.txt this log].


This is particularly worrying because:
This is its own kind of worrying because:
* This shows that we have to watch out not just for abnormally ''slow'' but also for abnormally ''fast'' operations.
* This shows that we have to watch out not just for abnormally ''slow'' but also for abnormally ''fast'' operations.
* This shows that Intel and AMD are not making completely parallel progress on these issues.
* This shows that Intel and AMD are not making completely parallel progress on these issues.
* Equality comparison is what we would use if we wanted to manually avoid NaN values. So having this not run in constant time means that the idea of manually avoiding specific "bad" values may not be feasible.


=== A significant mass of SSE2-capable CPUs do not have DAZ ===
=== A significant mass of SSE2-capable CPUs do not have DAZ ===