I recently noticed that
__m128 m = _mm_set_ps(0,1,2,3);
puts the 4 floats into reverse order when cast to a float array:
float *p = (float*)(&m);
// p[0] == 3
// p[1] == 2
// p[2] == 1
// p[3] == 0
The same happens with a union { __m128 m; float a[4]; }.
Why do SSE operations use this ordering? It's not a big deal but slightly confusing.
And a follow-up question:
When accessing elements in the array by index, should one access in the order 0..3
or the order 3..0?
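Depending on what you would like to do, you can use either _mm_set_ps or _mm_setr_ps. _mm_set_ps takes its arguments from the highest element down to the lowest, so its last argument ends up at the lowest address (p[0]); _mm_setr_ps takes them in the opposite order.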
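__m128 _mm_setr_ps (float z, float y, float x, float w ) Sets the four SP FP values to the four inputs in reverse order relative to _mm_set_ps, so the first argument (z) goes into the lowest element and reads back as p[0].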
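To make the difference concrete, here is a minimal sketch in C, assuming an x86 target with SSE enabled. The helper names lanes_to_array, demo_set and demo_setr are made up for this example; only _mm_set_ps, _mm_setr_ps and _mm_storeu_ps come from the intrinsics header. It copies the register to a plain float array with a store instead of casting a pointer.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_set_ps, _mm_setr_ps, _mm_storeu_ps */

/* Hypothetical helper: copy the four elements of v to out[0..3] in memory order. */
static void lanes_to_array(__m128 v, float out[4])
{
    _mm_storeu_ps(out, v);  /* unaligned store, so out needs no special alignment */
}

/* _mm_set_ps takes the highest element first: the LAST argument lands in out[0]. */
static void demo_set(float out[4])
{
    lanes_to_array(_mm_set_ps(0.0f, 1.0f, 2.0f, 3.0f), out);   /* out = {3, 2, 1, 0} */
}

/* _mm_setr_ps takes the lowest element first: the FIRST argument lands in out[0]. */
static void demo_setr(float out[4])
{
    lanes_to_array(_mm_setr_ps(0.0f, 1.0f, 2.0f, 3.0f), out);  /* out = {0, 1, 2, 3} */
}
```

Calling demo_set(a) and demo_setr(b) on two float[4] arrays and printing them shows the two orderings side by side. Either way, once the values are in memory you can index 0..3 as usual; the only choice is which intrinsic matches the order in which you prefer to write the arguments.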