Top "Fma" questions

Fused Multiply Add or Multiply-Accumulate

How to use Fused Multiply-Add (FMA) instructions with SSE/AVX

I have learned that some Intel/AMD CPUs can do simultaneous multiply and add with SSE/AVX: FLOPS per cycle …

c sse cpu-architecture avx fma

FMA3 in GCC: how to enable

I have an i5-4250U which has AVX2 and FMA3. I am testing some dense matrix multiplication code in …

c++ gcc intel avx fma

Preventing GCC from automatically using AVX and FMA instructions when compiled with -mavx and -mfma

How can I disable auto-vectorization with AVX and FMA instructions? I would still prefer the compiler to employ SSE and …

c++ gcc vectorization avx fma

How to get data out of AVX registers?

Using MSVC 2013 and AVX 1, I've got 8 floats in a register: __m256 foo = _mm256_fmadd_ps(a,b,c); Now I …

c++ visual-c++ avx fma

Obtaining peak bandwidth on Haswell in the L1 cache: only getting 62%

I'm attempting to obtain full bandwidth in the L1 cache for the following function on Intel processors: float triad(float *…

c memory assembly nasm fma