Parallel algorithm to compute Prefix Sum

user1134599 picture user1134599 · Apr 8, 2012 · Viewed 15.9k times · Source

I am having a problem with implementing the algorithm for computing a prefix sum in parallel. Even though this algorithm has 3 steps, I am unable to write the code, as no pseudo-code is given.

I went through various material on the web and also on stack overflow, but I didn't get the exact implementation of the algorithm as given on the wiki. The wiki mentions the following:

A prefix sum can be calculated in parallel by the following steps::

  1. Compute the sums of consecutive pairs of items in which the first item of the pair has an even index: z0 = x0 + x1, z1 = x2 + x3, etc.
  2. Recursively compute the prefix sum w0, w1, w2, ... of the sequence z0, z1, z2, ...
  3. Expand each term of the sequence w0, w1, w2, ... into two terms of the overall prefix sum: y0 = x0, y1 = w0, y2 = w0 + x2, y3 = w1, etc. After the first value, each successive number yi is either copied from a position half as far through the w sequence, or is the previous value added to one value in the x sequence

Can someone please suggest a pseudo-code implementation for me to check and implement?

Answer

Marcel Valdez Orozco picture Marcel Valdez Orozco · Oct 13, 2012

I believe the provided answer is not enough to understand the algorithm, so I provided an actual answer with more comprehensive pseudocode here: https://stackoverflow.com/a/12874227/697862 based on Parallel Prefix Sum (Scan) with CUDA which is a complete article describing the optimum parallel algorithm according to:

Blelloch, Guy E. 1990. "Prefix Sums and Their Applications." Technical Report CMU-CS-90-190, School of Computer Science, Carnegie Mellon University.