I recently noticed that
__m128 m = _mm_set_ps(0,1,2,3);
puts the 4 floats into reverse order when cast to a float array:
float* p = (float*)(&m);
// p[0] == 3
// p[1] == 2
// p[2] == 1
// p[3] == 0
The same happens with a union { __m128 m; float a[4]; }.
Why do SSE operations use this ordering? It's not a big deal but slightly confusing.
And a follow-up question:
When accessing elements in the array by index, should one access them in the order 0..3 or the order 3..0?
Depending on what you would like to do, you can use either _mm_set_ps or _mm_setr_ps.
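__m128 _mm_setr_ps(float z, float y, float x, float w) sets the four SP FP values to the four inputs in reverse order relative to _mm_set_ps: z goes to element 0, y to element 1, x to element 2 and w to element 3, so the arguments appear in memory in the order they were written.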
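To compare the two intrinsics side by side, here is a minimal sketch. The helper names store_set and store_setr are made up for illustration and are not part of any library. It uses _mm_storeu_ps to copy the vector into a plain float array instead of casting the pointer, and it assumes an x86 compiler with SSE enabled.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_set_ps, _mm_setr_ps, _mm_storeu_ps */

/* Hypothetical helper: build a vector with _mm_set_ps and copy it to out[0..3].
   _mm_set_ps takes its arguments from the highest element down to the lowest. */
static void store_set(float out[4], float a, float b, float c, float d)
{
    __m128 v = _mm_set_ps(a, b, c, d);
    _mm_storeu_ps(out, v);  /* unaligned store; out[0] receives the lowest element */
}

/* Hypothetical helper: same as above, but with _mm_setr_ps, which takes its
   arguments from the lowest element up to the highest. */
static void store_setr(float out[4], float a, float b, float c, float d)
{
    __m128 v = _mm_setr_ps(a, b, c, d);
    _mm_storeu_ps(out, v);
}
```

Calling store_set(s, 0, 1, 2, 3) should leave s as {3, 2, 1, 0}, matching what the question observed, while store_setr(r, 0, 1, 2, 3) should leave r as {0, 1, 2, 3}. Whichever one you use, index 0 of the array is always the lowest element of the register, so iterating 0..3 walks the elements in memory order.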