Speed up float 5x5 matrix * vector multiplication with SSE

c++ vectorization matrix-multiplication sse simd

Enzo · Jul 8, 2011 · Viewed 9.5k times · Source

I need to run a matrix-vector multiplication 240000 times per second. The matrix is 5x5 and is always the same, whereas the vector changes at each iteration. The data type is float. I was thinking of using some SSE (or similar) instructions.

I am concerned that the number of arithmetic operations is too small compared to the number of memory operations involved. Do you think I can get some tangible (e.g. > 20%) improvement?
Do I need the Intel compiler to do it?
Can you point out some references?

Answer

The Eigen C++ template library for vectors, matrices, ... has both optimised code for small fixed-size matrices (as well as dynamically sized ones) and code paths that use SSE instructions, so you should give it a try.
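
If pulling in a library is not an option, the same idea can be written by hand with SSE intrinsics. The following is only a minimal sketch, not Eigen's implementation: the names `Mat5Padded`, `pack_columns` and `mul_5x5_sse` are made up for illustration, and it assumes the constant matrix can be repacked once into a column-major layout with each column padded from 5 to 8 floats. Each input element is then broadcast and multiplied against a whole column, so no horizontal additions are needed.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)

// Hypothetical layout: column-major, each column padded to 8 floats so that
// rows 0..3 and row 4 (plus padding) each sit in one 16-byte-aligned register.
struct Mat5Padded {
    alignas(16) float col[5][8];
};

// Repack a row-major 5x5 matrix (25 floats) once, outside the hot loop.
inline Mat5Padded pack_columns(const float* row_major) {
    Mat5Padded p{};  // zero-initialised, so the padding lanes are 0
    for (int r = 0; r < 5; ++r)
        for (int c = 0; c < 5; ++c)
            p.col[c][r] = row_major[r * 5 + c];
    return p;
}

// y = A * x, where x and y each point to 5 floats (no alignment required).
inline void mul_5x5_sse(const Mat5Padded& a, const float* x, float* y) {
    __m128 lo = _mm_setzero_ps();  // accumulates rows 0..3
    __m128 hi = _mm_setzero_ps();  // accumulates row 4 in lane 0
    for (int j = 0; j < 5; ++j) {
        const __m128 xj = _mm_set1_ps(x[j]);  // broadcast x[j] to all lanes
        lo = _mm_add_ps(lo, _mm_mul_ps(_mm_load_ps(&a.col[j][0]), xj));
        hi = _mm_add_ps(hi, _mm_mul_ps(_mm_load_ps(&a.col[j][4]), xj));
    }
    _mm_storeu_ps(y, lo);     // write y[0..3]
    _mm_store_ss(y + 4, hi);  // write y[4] only
}
```

Call `pack_columns` once for the fixed matrix and `mul_5x5_sse` for every new vector. This compiles with GCC, Clang or MSVC on x86-64, so the Intel compiler is not required. Whether it beats what the compiler already produces for a plain scalar loop has to be measured on the target machine.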
