SIMD/Operations

From MozillaWiki

Revision as of 17:23, 2 May 2014

To make use of SIMD data, operations are needed that work on the corresponding data types.

Load and Store

To assemble vectors of data that can subsequently be operated upon, SIMD instruction sets include load instructions that copy data values from consecutive memory locations into SIMD registers. After the computation completes, the contents of SIMD registers can be copied back to memory locations using store instructions.

In JavaScript, these instructions need to be exposed to programmers in a way that makes it convenient to instantiate SIMD data types such as uint16x8. This could, for instance, be done with a constructor that accepts 8 numeric values (applying clamping etc. as needed):

var myUint16x8 = new uint16x8(1, 2, 3, 4, 5, 6, 7, 8);

This has the distinct disadvantage of being slow, as each data value needs to be converted to the corresponding scalar data type and written into memory before the SIMD load instruction can access it.

A more efficient approach may be to load data values from a suitably typed ArrayBufferView, which is backed by a memory region that stores the data values contiguously, as the SIMD load/store instructions require:

// myUint16Array is an ArrayBufferView of type Uint16Array
var myUint16x8 = new uint16x8(myUint16Array, 2);

This loads eight uint16 values from myUint16Array into myUint16x8, starting at offset 2.
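Since the uint16x8 constructor above is only a proposal, the load and store behaviour can be sketched with plain typed arrays. The helper names loadUint16x8 and storeUint16x8 are hypothetical, and the sketch assumes that the offset counts elements rather than bytes:

```javascript
// Hypothetical helpers emulating a vector load and store with typed arrays.
// Assumption: the offset is counted in elements, not bytes.
function loadUint16x8(source, offset) {
  if (offset < 0 || offset + 8 > source.length) {
    throw new RangeError("need 8 elements starting at offset " + offset);
  }
  // slice() copies, so the "register" is independent of the source memory.
  return source.slice(offset, offset + 8);
}

function storeUint16x8(target, offset, lanes) {
  // set() copies the lanes back into consecutive elements of the target.
  target.set(lanes, offset);
}

var myUint16Array = new Uint16Array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]);
var loadedLanes = loadUint16x8(myUint16Array, 2); // holds the values 2..9
storeUint16x8(myUint16Array, 4, loadedLanes);     // writes them to elements 4..11
```

A real implementation would map both helpers onto single load/store instructions instead of copying element by element.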

Arithmetic operations

Clearly, the following basic operations are needed:

  • Addition, optionally saturating
  • Subtraction, optionally saturating
  • Multiplication
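The difference between plain and saturating arithmetic can be illustrated lane by lane. This is a minimal sketch in which a Uint16Array of length 8 stands in for uint16x8; the function names are hypothetical and not part of any proposed API:

```javascript
// Hypothetical lane-wise helpers; a Uint16Array of length 8 stands in for uint16x8.
function addWrapping(a, b) {
  var out = new Uint16Array(8);
  for (var i = 0; i < 8; i++) {
    out[i] = a[i] + b[i]; // storing into a Uint16Array wraps modulo 65536
  }
  return out;
}

function addSaturating(a, b) {
  var out = new Uint16Array(8);
  for (var i = 0; i < 8; i++) {
    out[i] = Math.min(a[i] + b[i], 65535); // clamp at the largest uint16 value
  }
  return out;
}

function subSaturating(a, b) {
  var out = new Uint16Array(8);
  for (var i = 0; i < 8; i++) {
    out[i] = Math.max(a[i] - b[i], 0); // clamp at zero instead of wrapping
  }
  return out;
}

var left = new Uint16Array([1, 2, 3, 4, 5, 6, 7, 65535]);
var right = new Uint16Array([2, 2, 2, 2, 2, 2, 2, 2]);
var wrapped = addWrapping(left, right);     // last lane wraps around to 1
var saturated = addSaturating(left, right); // last lane stays at 65535
var difference = subSaturating(left, right); // first lane clamps to 0
```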

Division on modern CPUs can be achieved much faster using reciprocal multiplication. However, for ease of programming it might still make sense to offer a direct division operation (if the divisor can be detected to be static, the JIT engine may still convert it to reciprocal multiplication for full speed).
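As a rough illustration of reciprocal multiplication, the sketch below divides floating-point lanes by a fixed divisor. It assumes Float32Array lanes and a hypothetical helper name; note that when the reciprocal is not exactly representable, the result may differ from a true division in the last bit:

```javascript
// Hypothetical helper: divide every lane by the same divisor by
// multiplying with its reciprocal, which is computed only once.
function divideByConstant(lanes, divisor) {
  var reciprocal = 1 / divisor;
  var out = new Float32Array(lanes.length);
  for (var i = 0; i < lanes.length; i++) {
    out[i] = lanes[i] * reciprocal;
  }
  return out;
}

// Dividing by 4 is exact because 1/4 is representable in binary floating point.
var quotients = divideByConstant(new Float32Array([2, 4, 6, 8]), 4);
```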

For video codecs, support for the sum of absolute differences (SAD) operation on integer vector types can be very useful.
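A scalar reference version of SAD makes the intended result clear. The function name and the sample pixel values are made up for illustration:

```javascript
// Hypothetical scalar reference for SAD over two equally long integer vectors.
function sumOfAbsoluteDifferences(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    sum += Math.abs(a[i] - b[i]);
  }
  return sum;
}

var blockA = new Uint8Array([10, 20, 30, 40, 50, 60, 70, 80]);
var blockB = new Uint8Array([12, 18, 30, 45, 50, 50, 75, 80]);
// Lane differences are 2, 2, 0, 5, 0, 10, 5, 0, so the sum is 24.
var sad = sumOfAbsoluteDifferences(blockA, blockB);
```

A lower SAD means the two blocks are more alike, which is why codecs use it to compare candidate blocks during motion estimation.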

Min and max (clamping) are useful for every data type.
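Clamping can be built from lane-wise min and max. The following sketch uses Int16Array lanes and hypothetical function names:

```javascript
// Hypothetical lane-wise min and max on Int16Array lanes.
function minLanes(a, b) {
  var out = new Int16Array(a.length);
  for (var i = 0; i < a.length; i++) {
    out[i] = Math.min(a[i], b[i]);
  }
  return out;
}

function maxLanes(a, b) {
  var out = new Int16Array(a.length);
  for (var i = 0; i < a.length; i++) {
    out[i] = Math.max(a[i], b[i]);
  }
  return out;
}

// Clamp each lane of values into the range [lower, upper].
function clampLanes(values, lower, upper) {
  return minLanes(maxLanes(values, lower), upper);
}

var samples = new Int16Array([-5, 0, 100, 300]);
var lowerBound = new Int16Array([0, 0, 0, 0]);
var upperBound = new Int16Array([255, 255, 255, 255]);
var clamped = clampLanes(samples, lowerBound, upperBound);
```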

Averaging: SSE2 seems to offer averaging operations only for unsigned 8-bit and 16-bit integers, which may indicate what is considered useful.
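The SSE2 averaging instructions round up, computing (a + b + 1) >> 1 per lane. A sketch of the same behaviour for unsigned 8-bit lanes, with a hypothetical function name:

```javascript
// Hypothetical rounding average of unsigned 8-bit lanes.
// The intermediate sum is an ordinary JavaScript number, so it cannot overflow.
function averageUint8(a, b) {
  var out = new Uint8Array(a.length);
  for (var i = 0; i < a.length; i++) {
    out[i] = (a[i] + b[i] + 1) >> 1;
  }
  return out;
}

var averaged = averageUint8(
  new Uint8Array([0, 1, 255, 10]),
  new Uint8Array([1, 2, 255, 20])
);
```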

Where applicable, all operations should accept a scalar argument, which is automatically expanded to vector form before the operation is applied.
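One possible way to handle a scalar argument is to expand ("splat") it into a full vector before the lane-wise operation runs. The names splat and multiplyLanes are hypothetical, and Float32Array lanes are assumed:

```javascript
// Hypothetical helper: expand a scalar into a vector with identical lanes.
function splat(value, laneCount) {
  return new Float32Array(laneCount).fill(value);
}

// Lane-wise multiplication that accepts either a vector or a scalar on the right.
function multiplyLanes(a, b) {
  var rhs = typeof b === "number" ? splat(b, a.length) : b;
  var out = new Float32Array(a.length);
  for (var i = 0; i < a.length; i++) {
    out[i] = a[i] * rhs[i];
  }
  return out;
}

var vector = new Float32Array([1, 2, 3, 4]);
var scaled = multiplyLanes(vector, 2);        // scalar is expanded to [2, 2, 2, 2]
var squared = multiplyLanes(vector, vector);  // plain vector-by-vector case
```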

Logical operations

Shift operations

Pack and Unpack

Comparisons

Data type conversion