Does GCC have a memory alignment pragma, akin to #pragma vector aligned in the Intel compiler?
I would like to tell the compiler to optimize a particular loop using aligned load/store instructions. To avoid possible confusion, this is not about struct packing.
e.g.:
#if defined (__INTEL_COMPILER)
#pragma vector aligned
#endif
for (int a = 0; a < int(N); ++a) {
    q10 += Ix(a,0,0)*Iy(a,1,1)*Iz(a,0,0);
    q11 += Ix(a,0,0)*Iy(a,0,1)*Iz(a,1,0);
    q12 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,0,1);
    q13 += Ix(a,1,0)*Iy(a,0,0)*Iz(a,0,1);
    q14 += Ix(a,0,0)*Iy(a,1,0)*Iz(a,0,1);
    q15 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,1,1);
}
Thanks
You can tell GCC that a pointer points to aligned memory by using a typedef to create an over-aligned type that you can declare pointers to.
This helps GCC but not clang 7.0 or ICC 19; see the x86-64 non-AVX asm they emit on Godbolt. (Only GCC folds a load into a memory operand for mulps, instead of using a separate movups.) You have to use __builtin_assume_aligned if you want to portably convey an alignment promise to GNU C compilers other than GCC itself.
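As a rough sketch of the __builtin_assume_aligned route: the builtin takes a pointer and an alignment and returns the same pointer, which the compiler may then treat as aligned. The function name dot_aligned and the choice of 16 bytes are assumptions made for illustration, not something from the question. The assert is only a debug-time check of the promise and disappears with -DNDEBUG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dot product of two arrays that the caller
   promises are 16-byte aligned. */
double dot_aligned(const double *x, const double *y, size_t n)
{
    /* Debug-only sanity check; the builtin itself checks nothing. */
    assert((uintptr_t)x % 16 == 0 && (uintptr_t)y % 16 == 0);

    /* The returned pointers carry the alignment assumption; use them,
       not x and y, inside the loop. */
    const double *xa = __builtin_assume_aligned(x, 16);
    const double *ya = __builtin_assume_aligned(y, 16);

    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += xa[i] * ya[i];
    return sum;
}
```

Passing a pointer that is not actually 16-byte aligned is undefined behavior, so the promise has to hold at every call site. Whether the loop is vectorized still depends on the optimization level and target flags.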
From http://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html
typedef double aligned_double __attribute__((aligned (16)));
// Note: sizeof(aligned_double) is 8, not 16
void some_function(aligned_double *x, aligned_double *y, int n)
{
    for (int i = 0; i < n; ++i) {
        // math!
    }
}
This won't make aligned_double 16 bytes wide. It only makes it aligned to a 16-byte boundary, or rather the first one in an array will be. Looking at the disassembly on my computer, as soon as I use the alignment directive, I start to see a LOT of vector ops. I am using a Power architecture computer at the moment, so it's AltiVec code, but I think this does what you want.
(Note: I wasn't using double when I tested this, because AltiVec doesn't support double-precision floats.)
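To make the size-versus-alignment point concrete, here is a minimal sketch that fills in a loop body using the same typedef. The function axpy_aligned and the restrict qualifiers are assumptions added for illustration; the static assertions just restate the claim that the type is still 8 bytes wide (on targets where double is 8 bytes) but has an alignment of 16.

```c
#include <assert.h>
#include <stddef.h>

/* Same typedef as above: over-aligned, but not wider. */
typedef double aligned_double __attribute__((aligned(16)));

static_assert(sizeof(aligned_double) == sizeof(double),
              "the attribute does not change the size");
static_assert(__alignof__(aligned_double) == 16,
              "the attribute raises the alignment to 16");

/* Hypothetical example: y[i] += a * x[i].
   Callers must pass 16-byte-aligned, non-overlapping buffers. */
void axpy_aligned(double a, const aligned_double *restrict x,
                  aligned_double *restrict y, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Only pointers to aligned_double are useful here; GCC rejects an array of aligned_double because the element alignment exceeds the element size. Declare the storage as an ordinary double array with its own alignment attribute and pass that.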
You can see some other examples of autovectorization using the type attributes here: http://gcc.gnu.org/projects/tree-ssa/vectorization.html