Allocating shared memory

lina · Apr 3, 2011 · Viewed 42k times

I am trying to allocate shared memory using a constant parameter, but I am getting an error. My kernel looks like this:

__global__ void Kernel(const int count)
{
    __shared__ int a[count];
}

and I am getting an error saying

error: expression must have a constant value

count is const! Why am I getting this error? And how can I get around this?

Answer

talonmies · Apr 3, 2011

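A const kernel parameter is read-only inside the kernel, but its value is still only known at run time, and a statically sized __shared__ array needs a size that is known at compile time. CUDA does support dynamic shared memory allocation, though. If you define the kernel like this: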

__global__ void Kernel(const int count)
{
    extern __shared__ int a[];
}

and then pass the number of bytes required (for example, count * sizeof(int)) as the third argument of the kernel launch

Kernel<<< gridDim, blockDim, a_size >>>(count)

then the array can be sized at run time. Be aware that the runtime only supports a single dynamically declared allocation per block. If you need more, you will need to use pointers to offsets within that single allocation. Also be aware when using pointers that shared memory uses 32-bit words, and all allocations must be 32-bit word aligned, irrespective of the type of the shared memory allocation.
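As a rough illustration of the points above, here is a minimal sketch of one dynamically sized allocation split into two arrays by pointer offset. The extra out parameter, the names smem, f and d_out, and the launch configuration of one block of 32 threads are all assumptions made for the example, not part of the original kernel. Both element types are 4 bytes wide here, so the second array starts on a 32-bit boundary; with mixed sizes you would place the widest type first or pad the offset.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical variant of the kernel above: one dynamic shared allocation
// per block, carved into a float array followed by an int array.
__global__ void Kernel(const int count, int *out)
{
    extern __shared__ float smem[];                // sized by the third launch argument

    float *f = smem;                               // first `count` floats
    int   *a = reinterpret_cast<int *>(f + count); // next `count` ints, 4-byte aligned

    // Each thread fills a strided part of both arrays.
    for (int i = threadIdx.x; i < count; i += blockDim.x) {
        f[i] = 1.0f;
        a[i] = i;
    }
    __syncthreads();

    // Thread 0 reads both arrays back so the writes are observable on the host.
    if (threadIdx.x == 0) {
        int sum = 0;
        for (int i = 0; i < count; ++i)
            sum += a[i] + static_cast<int>(f[i]);
        *out = sum;
    }
}

int main()
{
    const int count = 64;
    // Total bytes for both arrays; this is the a_size passed at launch.
    const size_t a_size = count * (sizeof(float) + sizeof(int));

    int *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    Kernel<<<1, 32, a_size>>>(count, d_out);

    int h_out = 0;
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    printf("sum = %d\n", h_out);
    cudaFree(d_out);
    return 0;
}
```

The byte count passed at launch has to cover every sub-array, since the kernel itself has no way to grow the allocation.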