How can I exchange the low 128 bits and high 128 bits in a 256-bit AVX (YMM) register?

Mark Borgerding · Aug 26, 2011 · Viewed 8.7k times

I am porting SSE SIMD code to the 256-bit AVX extensions and cannot find any instruction that will blend, shuffle, or move data between the high 128 bits and the low 128 bits of a register.

The backstory:

What I really want is for VHADDPS/_mm256_hadd_ps to act like HADDPS/_mm_hadd_ps, only with 256-bit words. Unfortunately, it acts like two HADDPS operations working independently on the low and high 128-bit halves.
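
To make the lane behavior concrete, here is a minimal scalar sketch in C. It is a model, not the intrinsic: a YMM register is represented as a plain float[8], and the names model_hadd256 and model_hadd256_wanted are hypothetical. The second function reflects one reading of "HADDPS with 256-bit words", namely all pair sums of a followed by all pair sums of b.

```c
/* Scalar model only: each float[8] stands in for one YMM register,
   with [0..3] as the low 128-bit lane and [4..7] as the high lane. */

/* Models what _mm256_hadd_ps(a, b) computes: HADDPS applied to each lane separately. */
static void model_hadd256(float r[8], const float a[8], const float b[8])
{
    for (int lane = 0; lane < 8; lane += 4) {
        r[lane + 0] = a[lane + 0] + a[lane + 1];
        r[lane + 1] = a[lane + 2] + a[lane + 3];
        r[lane + 2] = b[lane + 0] + b[lane + 1];
        r[lane + 3] = b[lane + 2] + b[lane + 3];
    }
}

/* Assumed "wanted" full-width behavior: pair sums of a, then pair sums of b. */
static void model_hadd256_wanted(float r[8], const float a[8], const float b[8])
{
    for (int i = 0; i < 4; i++) {
        r[i]     = a[2 * i] + a[2 * i + 1];
        r[i + 4] = b[2 * i] + b[2 * i + 1];
    }
}
```

With a = {0, 1, ..., 7} and b = {10, 11, ..., 17}, the first model yields {1, 5, 21, 25, 9, 13, 29, 33}, while the second yields {1, 5, 9, 13, 21, 25, 29, 33}. The sums are the same; only their placement across the two lanes differs.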

Answer

Mark Borgerding · Aug 28, 2011

Using VPERM2F128, one can swap the low 128 and high 128 bits (as well as perform other permutations). The intrinsic function usage looks like

x = _mm256_permute2f128_ps(x, x, 1);

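The third argument is an 8-bit control value: bits 1:0 select which 128-bit source half goes to the low half of the result, and bits 5:4 select the source for the high half, so other values give other permutations. See the Intel Intrinsics Guide for details.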
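
As a rough illustration of how the control value is decoded, here is a scalar sketch in C. It is a model rather than the instruction itself: registers are plain float[8] arrays and the names model_select_lane and model_permute2f128 are hypothetical. It assumes the encoding in Intel's documentation, where each 4-bit field picks a.low (0), a.high (1), b.low (2), or b.high (3), and bit 3 of the field zeroes that half.

```c
#include <string.h>

/* Scalar model of the VPERM2F128 control byte, not the intrinsic.
   float[8] stands in for a YMM register: [0..3] low lane, [4..7] high lane. */
static void model_select_lane(float out[4], const float a[8], const float b[8],
                              unsigned field)
{
    if (field & 0x8u) {                         /* bit 3 of the field: zero this lane */
        memset(out, 0, 4 * sizeof(float));
        return;
    }
    const float *src = (field & 0x2u) ? b : a;  /* bit 1: second or first source */
    /* bit 0: take the high or the low lane of that source */
    memcpy(out, src + ((field & 0x1u) ? 4 : 0), 4 * sizeof(float));
}

static void model_permute2f128(float r[8], const float a[8], const float b[8],
                               unsigned imm8)
{
    float tmp[8];                               /* temporary, so r may alias a or b */
    model_select_lane(tmp,     a, b, imm8 & 0xFu);         /* imm8[3:0] -> low lane  */
    model_select_lane(tmp + 4, a, b, (imm8 >> 4) & 0xFu);  /* imm8[7:4] -> high lane */
    memcpy(r, tmp, sizeof tmp);
}
```

In this model, a control value of 1 with both sources equal to x reproduces the swap: the low lane takes x.high and the high lane takes x.low. Values 0x20 and 0x31 give [a.low, b.low] and [a.high, b.high] respectively; feeding those two results to _mm256_hadd_ps should produce the pair sums of a in the low half and those of b in the high half, which appears to be the ordering the question is after.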