Maximum blocks per grid: CUDA

smilingbuddha · May 18, 2011 · Viewed 29.5k times

What is the maximum number of blocks in a grid that can be created per kernel launch? I am slightly confused, since the compute capability table here says that there can be 65535 blocks per grid dimension in CUDA compute capability 2.0.

Does that mean the total number of blocks = 65535*65535?

Or does it mean that you can arrange at most 65535 blocks in total, either as a 1D grid of 65535 blocks or as a 2D grid of sqrt(65535) * sqrt(65535) blocks?

Thank you.

Answer

talonmies · May 18, 2011

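The limit is 65535 blocks per dimension of the grid, so the total is the product of the dimensions, not 65535 overall. On compute 1.x cards, 1D and 2D grids are supported. On compute 2.x cards, 3D grids are also supported, so the limits for Fermi (compute 2.x) cards are 65535 blocks for a 1D grid, 65535 x 65535 for a 2D grid, and 65535 x 65535 x 65535 for a 3D grid.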
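If you would rather check the per-dimension limits on your own card than rely on the table, something like the following should work. It is only a minimal sketch: it assumes device 0, and `totalBlocks` and `noop` are hypothetical names made up for illustration. `cudaGetDeviceProperties` and the `maxGridSize` field come from the CUDA runtime API. The values printed depend on the device, so newer cards may report something other than 65535.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: total number of blocks in an x * y * z grid.
// Uses 64-bit arithmetic because 65535 * 65535 does not fit in a signed 32-bit int.
unsigned long long totalBlocks(unsigned x, unsigned y, unsigned z)
{
    return (unsigned long long)x * y * z;
}

// Hypothetical empty kernel, used only to see whether a launch configuration is accepted.
__global__ void noop() {}

int main()
{
    // Assumption: we query device 0.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("compute capability: %d.%d\n", prop.major, prop.minor);
    printf("max grid size per dimension: %d x %d x %d\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    printf("max total blocks (product): %llu\n",
           totalBlocks(prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]));

    // A 2D grid with more than 65535 blocks in total (65535 * 2),
    // while each individual dimension stays within 65535.
    dim3 grid(65535, 2);
    noop<<<grid, 1>>>();
    err = cudaGetLastError();          // reports an invalid launch configuration
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize(); // waits for the kernel to finish
    printf("launch of %llu blocks: %s\n",
           totalBlocks(grid.x, grid.y, grid.z), cudaGetErrorString(err));
    return 0;
}
```

Compile with `nvcc`. If the 2D launch is accepted, that confirms the limit applies to each dimension separately rather than to the total block count.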