|
|
| Line 132: |
Line 132: |
| * [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-d-3GHz-family-15-model-4-stepping-7-dell-dimension9150-gavin/time-math-x86-64-DAZ-FTZ.txt this Pentium D], family 15 model 4. This set of results shows that the problem exists even in 64-bit mode. | | * [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-d-3GHz-family-15-model-4-stepping-7-dell-dimension9150-gavin/time-math-x86-64-DAZ-FTZ.txt this Pentium D], family 15 model 4. This set of results shows that the problem exists even in 64-bit mode. |
|
| |
|
| === Some Intel CPUs have abnormally slow comparisons with NaN and denormals, even with SSE2, DAZ and FTZ === | | === A significant mass of SSE2-capable CPUs do not have DAZ === |
|
| |
|
| This is another way in which the notion that switching to SSE2 solves all NaN issues is not quite true.
| | Conventional "wisdom" is that DAZ is available nearly everywhere SSE2 is. |
|
| |
|
| The SSE instructions being used there are ucomiss and ucomisd.
| | This turns out not to be the case even for Intel CPUs: this [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-m-1.4GHz-family-6-model-9-stepping-5-toshiba-tecra/ Pentium M] supports SSE2 but evidently [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-m-1.4GHz-family-6-model-9-stepping-5-toshiba-tecra/time-math-x86-32-sse2-DAZ-FTZ.txt doesn't support DAZ]. As a result, timings are all over the place, even on just additions, as soon as denormals are involved. |
|
| |
|
| An [http://people.mozilla.org/~bjacob/ta/results/intel-core-2-2.66GHz-family-6-model-15-stepping-11-st3fan/ Intel Core 2] exhibits abnormally ''slow'' floating-point equality comparisons, on both floats and doubles, whenever NaN or denormals are involved, even with SSE2, DAZ and FTZ. See the alerts about float and double equality comparison in [http://people.mozilla.org/~bjacob/ta/results/intel-core-2-2.66GHz-family-6-model-15-stepping-11-st3fan/time-math-x86-64-DAZ-FTZ.txt this log].
| | It is also reported that some early Pentium 4 CPUs were in the same situation. Conversely, not ''all'' Pentium M's are affected (here is an [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-m-1.6GHz-family-6-model-13-stepping-6-dell-inspiron600m-gavin/ unaffected one]). |
|
| |
|
| Timing differences in equality comparisons are worrisome because equality comparison is what we would naturally use to manually avoid NaN values. If the comparison itself does not run in constant time, manually avoiding specific "bad" values may be difficult. | | === Timing differences in equality comparisons === |
|
| |
|
| === Some AMD CPUs have abnormally ''fast'' comparisons with NaN, even with SSE2 instructions ===
| | The SSE instructions being used there are ucomiss and ucomisd. |
|
| |
|
| This one is also about equality comparisons, but in a different way. An [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/ 8-core AMD FX processor] exhibits abnormally ''fast'' single-precision floating-point equality comparisons (ucomiss / ucomisd instructions) when one of the operands is NaN. See the alerts about float equality comparison in [http://people.mozilla.org/~bjacob/ta/results/amd-fx-8150-family-21-model-1-stepping-2-benwa-bulldozer/time-math-x86-64-DAZ-FTZ.txt this log].
| | The first problem is denormals: equality comparisons are affected by denormals just as other operations are. |
|
| |
|
| This is worrying in its own way because:
| | The next problem is that many CPUs have abnormally ''slow'' or ''fast'' equality comparisons when the operands are non-finite. |
| * This shows that we have to watch out not just for abnormally ''slow'' but also for abnormally ''fast'' operations.
| |
| * This shows that Intel and AMD are not making completely parallel progress on these issues.
| |
|
| |
|
| === A significant mass of SSE2-capable CPUs do not have DAZ ===
| | Specifically: |
| | | * Recent Intel CPUs tend to have abnormally slow comparisons with NaN. This includes this Core i7 and this Core 2. |
| Conventional "wisdom" is that DAZ is available nearly everywhere SSE2 is.
| | * Recent AMD CPUs, older Intel CPUs, and Intel Atom CPUs have abnormally fast comparisons with NaN. Examples: Athlon FX, Athlon II, Pentium B940, Pentium 4, Pentium D, Intel Atom. |
| | | * Some VIA CPUs have abnormally fast comparisons with NaN and Inf. See this VIA Nano. |
| This turns out not to be the case even for Intel CPUs: this [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-m-1.4GHz-family-6-model-9-stepping-5-toshiba-tecra/ Pentium M] supports SSE2 but evidently [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-m-1.4GHz-family-6-model-9-stepping-5-toshiba-tecra/time-math-x86-32-sse2-DAZ-FTZ.txt doesn't support DAZ]. As a result, timings are all over the place, even on just additions, as soon as denormals are involved.
| |
| | |
| It is also reported that some early Pentium 4 CPUs were in the same situation. Conversely, not ''all'' Pentium M's are affected (here is an [http://people.mozilla.org/~bjacob/ta/results/intel-pentium-m-1.6GHz-family-6-model-13-stepping-6-dell-inspiron600m-gavin/ unaffected one]).
| |
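The sections above note that equality comparison (ucomiss / ucomisd) is the natural way to screen out NaN and denormal values by hand, yet its timing varies across CPUs. As a rough illustration only, and not something these test results evaluate, the sketch below classifies a float by inspecting its IEEE 754 single-precision bit pattern with integer operations, so the source contains no floating-point compare. All helper names are hypothetical. Whether a given compiler emits constant-time machine code for this is an assumption that would need to be checked on each target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a float from a raw 32-bit pattern. */
static float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: read the raw 32-bit pattern of a float. */
static uint32_t bits_from_float(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* NaN: exponent field all ones and mantissa nonzero (sign ignored). */
static int is_nan_bits(float f) {
    return (bits_from_float(f) & 0x7fffffffu) > 0x7f800000u;
}

/* Infinity: exponent field all ones and mantissa zero (sign ignored). */
static int is_inf_bits(float f) {
    return (bits_from_float(f) & 0x7fffffffu) == 0x7f800000u;
}

/* Denormal: exponent field zero and mantissa nonzero (sign ignored). */
static int is_denormal_bits(float f) {
    uint32_t magnitude = bits_from_float(f) & 0x7fffffffu;
    return magnitude != 0 && magnitude < 0x00800000u;
}
```

For example, `is_nan_bits(float_from_bits(0x7fc00000u))` is nonzero, because that pattern is a quiet NaN. The same approach extends to doubles with a 64-bit integer and the corresponding masks.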
|
| |
|
| === Some CPUs have surprising timing differences on 64-bit integer arithmetic using 32-bit integer arithmetic instructions, when DAZ is enabled === | | === Some CPUs have surprising timing differences on 64-bit integer arithmetic using 32-bit integer arithmetic instructions, when DAZ is enabled === |