How to get a "random" number in OpenCL

user886910 · Mar 28, 2012

I'm looking to get a random number in OpenCL. It doesn't have to be real random or even that random. Just something simple and quick.

I see there are a ton of truly random, parallelized, fancy-pants random algorithms for OpenCL that run to thousands and thousands of lines. I do NOT need anything like that. A simple 'random()' would be fine, even if it is easy to see patterns in it.

I see there is a noise function. Is there any easy way to use that to get a random number?

Answer

icl7126 · Apr 21, 2013

I was working on this "no random" issue for the last few days and I came up with three different approaches:

  1. Xorshift - I created a generator based on this one. All you have to do is provide one uint2 value (the seed) for the whole kernel, and every work item will compute its own random number.

    // 'randoms' is a uint2 passed to the kernel
    uint globalID = get_global_id(0);  // index of this work item
    uint seed = randoms.x + globalID;
    uint t = seed ^ (seed << 11);
    uint result = randoms.y ^ (randoms.y >> 19) ^ (t ^ (t >> 8));
    
  2. Java random - I used the code from the .next(int bits) method of java.util.Random to generate a random number. This time you have to provide one ulong value as the seed.

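    // 'randoms' is a ulong passed to the kernel;
    // globalID is this work item's get_global_id(0)
    ulong seed = randoms + globalID;
    seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);  // keep the low 48 bits
    uint result = seed >> 16;  // upper 32 bits of the 48-bit state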
    
  3. Just generate all the numbers on the CPU and pass them to the kernel in one big buffer.
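
To make it easier to check approaches 1 and 2 without a GPU, here is a minimal sketch of the same arithmetic in plain C. The function names and parameters are my own, not part of any library: seed_x and seed_y stand in for randoms.x and randoms.y, and global_id stands in for what get_global_id(0) would return inside a kernel.

```c
#include <stdint.h>

/* One xorshift-style step, as in approach 1.
   seed_x / seed_y play the role of randoms.x / randoms.y (assumed names). */
static uint32_t xorshift_step(uint32_t seed_x, uint32_t seed_y, uint32_t global_id)
{
    uint32_t seed = seed_x + global_id;
    uint32_t t = seed ^ (seed << 11);
    return seed_y ^ (seed_y >> 19) ^ (t ^ (t >> 8));
}

/* One step of the java.util.Random-style LCG, as in approach 2.
   seed_base plays the role of the ulong 'randoms' (assumed name). */
static uint32_t java_lcg_step(uint64_t seed_base, uint32_t global_id)
{
    uint64_t seed = seed_base + global_id;
    /* Multiply, add, and keep only the low 48 bits of the state. */
    seed = (seed * 0x5DEECE66DULL + 0xBULL) & ((1ULL << 48) - 1);
    /* Return the upper 32 bits of the 48-bit state. */
    return (uint32_t)(seed >> 16);
}
```

Both functions perform a single step per work item, so each call with the same seed and ID returns the same value. If a work item needs several numbers, the host would have to pass a fresh seed on each kernel launch, or the kernel would have to carry the state forward itself.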

I tested all three approaches (generators) in my evolution algorithm, which computes a minimum dominating set in graphs.

I like the generated numbers from the first one, but it looks like my evolution algorithm doesn't.

The second generator produces numbers that have some visible pattern, but my evolution algorithm likes it that way anyway, and the whole thing runs a little faster than with the first generator.

But the third approach shows that it's absolutely fine to just provide all the numbers from the host (CPU). At first I thought that generating (in my case) 1536 int32 numbers and passing them to the GPU on every kernel call would be too expensive (to compute and to transfer to the GPU). But it turns out to be as fast as my previous attempts, and CPU load stays under 5%.
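
For the third approach, the host side could look roughly like the sketch below. This is only an illustration, not the code I actually ran: host_next and fill_random_buffer are made-up names, and the generator simply reuses the LCG constants from approach 2. Any host-side generator would do.

```c
#include <stddef.h>
#include <stdint.h>

/* Advance a 48-bit LCG state (same constants as approach 2) and
   return the upper 32 bits of the new state. Hypothetical helper. */
static uint32_t host_next(uint64_t *state)
{
    *state = (*state * 0x5DEECE66DULL + 0xBULL) & ((1ULL << 48) - 1);
    return (uint32_t)(*state >> 16);
}

/* Fill a host-side array with n pseudo-random 32-bit values.
   The state is carried across calls, so each refill continues
   the sequence instead of repeating it. */
static void fill_random_buffer(uint32_t *buf, size_t n, uint64_t *state)
{
    for (size_t i = 0; i < n; ++i)
        buf[i] = host_next(state);
}
```

Before each kernel call, the filled array (1536 values in my case) would be copied into a device buffer, for example with clEnqueueWriteBuffer, and each work item would read its own entry using get_global_id(0) as the index.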

BTW, I also tried MWC64X Random, but after I installed a new GPU driver the mul_hi function started causing build failures (even the whole AMD Kernel Analyzer crashed).
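
One possible way around mul_hi is to take the high word from a 64-bit product instead. The sketch below shows one multiply-with-carry step in plain C under that assumption. The struct and function names are made up, the multiplier is the value commonly quoted for MWC64X, and I have not verified that this avoids the driver problem described above. In a kernel the same idea would use ulong in place of uint64_t.

```c
#include <stdint.h>

/* Hypothetical state layout: x is the current value, c is the carry. */
typedef struct {
    uint32_t x;
    uint32_t c;
} mwc64x_state;

/* One multiply-with-carry step. The 64-bit product x * A + c yields the
   new x in its low word and the new carry in its high word, so no
   separate mul_hi call is needed. */
static uint32_t mwc64x_next(mwc64x_state *s)
{
    const uint64_t A = 4294883355ULL;   /* multiplier assumed from MWC64X */
    uint32_t result = s->x ^ s->c;      /* output is taken before stepping */
    uint64_t p = (uint64_t)s->x * A + s->c;
    s->x = (uint32_t)p;                 /* low 32 bits */
    s->c = (uint32_t)(p >> 32);         /* high 32 bits */
    return result;
}
```

Each work item would need its own state, for example seeded from the host value plus the work item's global ID.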