I am studying OpenMP, and came across the following example:
#pragma omp parallel shared(n,a,b,c,d,sum) private(i)
{
    #pragma omp for nowait
    for (i=0; i<n; i++)
        a[i] += b[i];
    #pragma omp for nowait
    for (i=0; i<n; i++)
        c[i] += d[i];
    #pragma omp barrier
    #pragma omp for nowait reduction(+:sum)
    for (i=0; i<n; i++)
        sum += a[i] + c[i];
} /*-- End of parallel region --*/
In the last for loop, there is a nowait and a reduction clause. Is this correct? Doesn't the reduction clause need to be synchronized?
Yes, this is correct. A reduction combines each thread's private partial result, and with nowait the combined value is simply not guaranteed until the next synchronization point, which here is the implicit barrier at the end of the parallel region.
That said, the nowaits in the second and last loops are somewhat redundant. The OpenMP spec mentions using nowait on the last loop before the end of a region, so perhaps that one can stay in. But the nowait before the second loop and the explicit barrier after it cancel each other out: dropping both leaves the loop's implicit barrier, which synchronizes the threads just the same.
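To make the cancellation concrete, these two fragments (inside a parallel region) synchronize identically:
/* nowait removes the loop's implicit barrier... */
#pragma omp for nowait
for (int i = 0; i < n; ++i)
    c[i] += d[i];
#pragma omp barrier  /* ...and this puts an explicit one right back */

/* which is the same as simply keeping the implicit barrier: */
#pragma omp for
for (int i = 0; i < n; ++i)
    c[i] += d[i];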
Lastly, about the shared and private clauses. In your code, shared has no effect (those variables are shared by default anyway), and private simply shouldn't be used at all: if you need a thread-private variable, just declare it inside the parallel region. In particular, you should declare loop variables inside the loop, not before.
To make shared useful, you need to tell OpenMP that it shouldn't share anything by default. You should do this anyway to avoid bugs due to accidentally shared variables. This is done by specifying default(none). This leaves us with:
#pragma omp parallel default(none) shared(n, a, b, c, d, sum)
{
    #pragma omp for nowait
    for (int i = 0; i < n; ++i)
        a[i] += b[i];
    #pragma omp for
    for (int i = 0; i < n; ++i)
        c[i] += d[i];
    #pragma omp for nowait reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += a[i] + c[i];
} // End of parallel region
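For completeness, here is a self-contained sketch that wraps the region above in a runnable program so the reduction can be checked. The problem size and initial values are made up for illustration:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int n = 1000;  /* hypothetical problem size */
    double *a = malloc(n * sizeof *a), *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c), *d = malloc(n * sizeof *d);
    double sum = 0.0;

    for (int i = 0; i < n; ++i) {
        a[i] = 1.0; b[i] = 2.0; c[i] = 3.0; d[i] = 4.0;
    }

    #pragma omp parallel default(none) shared(n, a, b, c, d, sum)
    {
        #pragma omp for nowait
        for (int i = 0; i < n; ++i)
            a[i] += b[i];        /* a[i] becomes 3.0 */
        #pragma omp for          /* no nowait: implicit barrier after this loop */
        for (int i = 0; i < n; ++i)
            c[i] += d[i];        /* c[i] becomes 7.0 */
        #pragma omp for nowait reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += a[i] + c[i];  /* each thread adds into a private copy */
    } /* implicit barrier: sum now holds the combined result */

    printf("sum = %.1f (expected %.1f)\n", sum, 10.0 * n);
    free(a); free(b); free(c); free(d);
    return 0;
}

Compile with OpenMP enabled (e.g. gcc -fopenmp); without it, the pragmas are ignored and the program produces the same result serially.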