Is it a good idea to vectorize the code? What are good practices in terms of when to do it? What happens underneath?
Vectorization means that the compiler detects that independent operations, typically the iterations of a loop, can be executed together as a single SIMD instruction. The usual example is a loop like this:
for (i = 0; i < N; i++) {
    a[i] = a[i] + b[i];
}
The compiler will vectorize it as follows (using vector notation):
for (i = 0; i < (N - N%VF); i += VF) {
    a[i:i+VF] = a[i:i+VF] + b[i:i+VF];
}
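Here VF is the vectorization factor, i.e. the number of array elements that fit in one SIMD register. The compiler picks an operation that can be applied to VF elements of the array at once and executes it N/VF times instead of executing the scalar operation N times. The remaining N%VF elements, which the loop bound N - N%VF leaves out, are handled by an ordinary scalar loop afterwards.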
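This usually improves performance, but it places more requirements on the architecture: the CPU has to provide SIMD instructions (for example SSE or AVX on x86, NEON on ARM), and the compiler has to be allowed to target them.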
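To make the transformation concrete, here is a minimal sketch in plain C that writes the transformed loop out by hand, including the scalar remainder loop. It only illustrates the shape of the code and is not what any particular compiler emits. The names add_scalar and add_blocked and the value VF = 4 are assumptions made up for this example. The restrict qualifiers state that a and b do not overlap, which is the kind of independence the compiler needs before it can vectorize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vectorization factor: e.g. 4 floats in one 128-bit register. */
#define VF 4

/* Scalar reference version: one addition per iteration. */
void add_scalar(float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        a[i] = a[i] + b[i];
}

/* Hand-written equivalent of the vectorized loop.
 * 'restrict' promises that a and b do not alias. */
void add_blocked(float *restrict a, const float *restrict b, size_t n)
{
    assert(a != NULL && b != NULL);
    size_t i = 0;
    size_t main_end = n - n % VF;   /* largest multiple of VF that is <= n */

    /* Main loop: each outer iteration handles VF elements. The inner
     * loop stands in for a single SIMD add on real hardware. */
    for (; i < main_end; i += VF) {
        for (size_t j = 0; j < VF; j++)
            a[i + j] = a[i + j] + b[i + j];
    }

    /* Remainder loop: the last n % VF elements, one at a time. */
    for (; i < n; i++)
        a[i] = a[i] + b[i];
}
```

Both functions produce the same result for any n. The blocked version only shows where the N/VF vector iterations and the N%VF leftover iterations come from.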