Popular "simd" questions | Page 3

I need some clarification. I'm developing OpenCL on my laptop, which runs a small NVIDIA GPU (310M). When I query the …

opencl nvidia simd

Why, at the lowest level of the hardware performing operations and the general underlying operations involved (i.e.: things general …

performance language-agnostic vectorization simd low-level

Can anyone recommend a portable SIMD library that provides a C/C++ API, works on Intel and AMD extensions and Visual …

c++ open-source cross-platform simd

I've got some code, originally given to me by someone working with MSVC, and I'm trying to get it to …

c++ clang sse simd intrinsics

Does anyone know an open-source C++ x86 SIMD intrinsics library? Intel supplies exactly what I need in their integrated performance …

c++ sse simd intrinsics

How do I use the multiply-accumulate intrinsics provided by GCC? float32x4_t vmlaq_f32 (float32x4_t , float32x4_t , …

c arm simd intrinsics neon

A common operation I do in my program is scaling vectors by a scalar (V*s, e.g. [1,2,3,4]*2 == [2,4,6,8]). Is there …

c x86 sse simd

The Intel Advanced Vector Extensions (AVX) offer no dot product in the 256-bit version (YMM register) for double precision floating …

c++ performance simd avx

How can I multiply four 32-bit integers by another four integers? I couldn't find any instruction that does it.

x86 sse simd multiplication sse2

I tried to improve the performance of a copy operation via SSE and AVX: #include <immintrin.h> const int sz = 1024; …

c++ performance sse simd avx
