If I use nested parallel for loops like this:
#pragma omp parallel for schedule(dynamic,1)
for (int x = 0; x < x_max; ++x) {
    #pragma omp parallel for schedule(dynamic,1)
    for (int y = 0; y < y_max; ++y) {
        //parallelize this code here
    }
    //IMPORTANT: no code in here
}
is this equivalent to:
for (int x = 0; x < x_max; ++x) {
    #pragma omp parallel for schedule(dynamic,1)
    for (int y = 0; y < y_max; ++y) {
        //parallelize this code here
    }
    //IMPORTANT: no code in here
}
Is the outer parallel for doing anything other than creating a new task?
If your compiler supports OpenMP 3.0 or later, you can use the collapse clause:
#pragma omp parallel for schedule(dynamic,1) collapse(2)
for (int x = 0; x < x_max; ++x) {
    for (int y = 0; y < y_max; ++y) {
        //parallelize this code here
    }
    //IMPORTANT: no code in here
}
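As a rough check of what collapse(2) does, here is a minimal sketch that counts the iterations of the collapsed loop nest. The function name count_collapsed_iterations and the reduction clause are my additions for illustration and are not part of the code above. The loops have to stay perfectly nested for collapse to apply.

```c
#include <assert.h>

/* Hypothetical helper (illustration only): counts how many times the body
   of the collapsed loop nest runs. */
long count_collapsed_iterations(int x_max, int y_max)
{
    assert(x_max >= 0 && y_max >= 0);
    long count = 0;

    /* collapse(2) merges both loops into a single iteration space of
       x_max * y_max iterations; reduction(+:count) sums the per-thread
       counts once the loop has finished. */
    #pragma omp parallel for schedule(dynamic,1) collapse(2) reduction(+:count)
    for (int x = 0; x < x_max; ++x) {
        for (int y = 0; y < y_max; ++y) {
            count += 1;
        }
    }
    return count;
}
```

Build with OpenMP enabled (for example -fopenmp on GCC or Clang). Without it the pragma is ignored and the loops run serially with the same result.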
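If your compiler doesn't support collapse (e.g. only OpenMP 2.5 is supported), there is a simple workaround: flatten the two loops into one and recover x and y from the combined index: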
#pragma omp parallel for schedule(dynamic,1)
for (int xy = 0; xy < x_max*y_max; ++xy) {
    int x = xy / y_max;
    int y = xy % y_max;
    //parallelize this code here
}
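To see that the flattened index really covers every (x, y) pair exactly once, here is a small self-contained sketch. The function name flattened_visits_each_cell_once and the visits array are hypothetical and only there for the check. I also widen the product to long long, on the assumption that x_max*y_max could overflow an int for large grids.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (illustration only): walks the flattened loop and
   returns 1 if every (x, y) cell was visited exactly once, otherwise 0. */
int flattened_visits_each_cell_once(int x_max, int y_max)
{
    assert(x_max > 0 && y_max > 0);
    long long total = (long long)x_max * y_max;  /* widened to avoid int overflow */
    int *visits = calloc((size_t)total, sizeof *visits);
    if (!visits) return 0;

    #pragma omp parallel for schedule(dynamic,1)
    for (long long xy = 0; xy < total; ++xy) {
        int x = (int)(xy / y_max);   /* row index recovered from xy */
        int y = (int)(xy % y_max);   /* column index recovered from xy */
        /* Each xy maps to a distinct cell, so threads never share an element. */
        visits[(long long)x * y_max + y] += 1;
    }

    int ok = 1;
    for (long long i = 0; i < total; ++i) {
        if (visits[i] != 1) ok = 0;
    }
    free(visits);
    return ok;
}
```

Calling flattened_visits_each_cell_once(3, 5) should return 1 whether or not the code is built with OpenMP, since the pragma only changes which thread handles each xy.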
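You can enable nested parallelism with omp_set_nested(1); and your nested omp parallel for code will then run both levels in parallel, but that might not be the best idea: every outer iteration creates its own inner team of threads, which can easily oversubscribe the cores. When nesting is disabled, the inner parallel region is executed by a team of just one thread, so the inner pragma adds no parallelism.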
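If you want to see what nesting does on your machine, something like the following sketch reports the largest inner team size. The function name max_inner_team_size is hypothetical. I use omp_set_max_active_levels (available since OpenMP 3.0) instead of omp_set_nested, because OpenMP 5.0 deprecates the latter. Note that older runtimes may also need omp_set_nested(1) or OMP_NESTED=true before the inner level becomes active, so treat the nested result as implementation dependent.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (illustration only): runs a parallel region inside a
   parallel loop and returns the largest inner team size that was observed. */
int max_inner_team_size(int enable_nesting)
{
    assert(enable_nesting == 0 || enable_nesting == 1);
    int max_team = 0;

#ifdef _OPENMP
    /* Allow two active levels for nesting, or only one to serialize the
       inner region. */
    omp_set_max_active_levels(enable_nesting ? 2 : 1);
#endif

    #pragma omp parallel for schedule(dynamic,1)
    for (int x = 0; x < 4; ++x) {
        #pragma omp parallel
        {
            int team = 1;  /* value used when built without OpenMP */
#ifdef _OPENMP
            team = omp_get_num_threads();  /* size of the inner team */
#endif
            #pragma omp critical
            {
                if (team > max_team) max_team = team;
            }
        }
    }
    return max_team;
}
```

With nesting off I would expect this to return 1 on a multi-core machine. With nesting on, the value depends on the runtime, the thread limits, and the number of cores.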
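By the way, why the dynamic scheduling? Do the loop iterations take noticeably different amounts of time? If they all cost about the same, schedule(dynamic,1) only adds scheduling overhead for every iteration.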