Top "SIMD" questions

Single instruction, multiple data (SIMD) is the concept of having each instruction operate on a small chunk or vector of data elements.
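As a minimal sketch of that idea, assuming an x86 target with SSE available (the default on x86-64 compilers), the function below adds four floats with a single vector add. The name add4_sse is hypothetical and chosen only for illustration; the intrinsics are the standard ones from <xmmintrin.h>.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */

/* Hypothetical helper (name chosen for illustration): adds a[0..3] and b[0..3]
   element-wise and writes the four sums to out[0..3]. */
void add4_sse(const float *a, const float *b, float *out)
{
    __m128 va  = _mm_loadu_ps(a);       /* unaligned load of four floats from a */
    __m128 vb  = _mm_loadu_ps(b);       /* unaligned load of four floats from b */
    __m128 sum = _mm_add_ps(va, vb);    /* one vector add performs four additions */
    _mm_storeu_ps(out, sum);            /* unaligned store of the four results */
}
```

The unaligned load and store variants are used so the sketch works on any float array; code that can guarantee 16-byte alignment might use _mm_load_ps and _mm_store_ps instead.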

SIMD math libraries for SSE and AVX

I am looking for SIMD math libraries (preferably open source) for SSE and AVX. I mean for example if I …

sse simd avx math.h
Using __m256d registers

How do you use __m256d? Say I want to use the Intel AVX instruction _mm256_add_pd on a …

c++ x86 intel simd avx
How can I disable vectorization while using GCC?

I am compiling my code using the following command: gcc -O3 -ftree-vectorizer-verbose=6 -msse4.1 -ffast-math. With this, all the optimizations are enabled. …

gcc vectorization sse simd auto-vectorization
How fast can you make linear search?

I'm looking to optimize this linear search: static int linear (const int *arr, int n, int key) { int i = 0; while (…

c search optimization simd linear-search
AVX2: what is the most efficient way to pack left based on a mask?

If you have an input array, and an output array, but you only want to write those elements which pass …

c++ vectorization sse simd avx2
SIMD and difference between packed and scalar double precision

I am reading Intel's intrinsics guide while implementing SIMD support. A few things confuse me, and my questions are as …

c++ x86 sse simd intrinsics
Reference manual/tutorial for SIMD intrinsics?

I'm looking into using these to improve the performance of some code but good documentation seems hard to find for …

simd intrinsics
Push XMM register to the stack

Is there a way of pushing a packed doubleword integer from an XMM register to the stack, and then later on …

assembly x86 simd sse
How to choose AVX compare predicate variants

In the Advanced Vector Extensions (AVX), compare instructions like _mm256_cmp_ps take a compare predicate as the last argument. …

simd avx
Speed up float 5x5 matrix * vector multiplication with SSE

I need to run a matrix-vector multiplication 240000 times per second. The matrix is 5x5 and is always the same, whereas …

c++ vectorization matrix-multiplication sse simd