Top "Cuda" questions

CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model for NVIDIA GPUs (Graphics Processing Units).

nvidia-smi Failed to initialize NVML: GPU access blocked by the operating system

when asking for nvidia-smi it gives this error: Failed to initialize NVML: GPU access blocked by the operating system other …

cuda gpu nvidia
CUDA: How to use -arch and -code and SM vs COMPUTE

I am still not sure how to properly specify the architectures for code generation when building with nvcc. I am …

cuda nvcc ptx fat-binaries
Can/Should I run this code on a GPU?

I'm working on a statistical application containing approximately 10 - 30 million floating point values in an array. Several methods performing different, …

c++ c cuda parallel-processing gpu
When is CUDA's __shared__ memory useful?

Can someone please help me with a very simple example on how to use shared memory? The example included in …

c cuda gpu
Allocate 2D Array on Device Memory in CUDA

How do I allocate and transfer(to and from Host) 2D arrays in device memory in Cuda?

multidimensional-array memory-management 2d cuda device
Block reduction in CUDA

I am trying to do reduction in CUDA and I am really a newbie. I am currently studying a sample …

algorithm cuda reduction cub
What can I do against 'CUDA driver version is insufficient for CUDA runtime version'?

When I go to /usr/local/cuda/samples/1_Utilities/deviceQuery and execute moose@pc09 /usr/local/cuda/samples/1_Utilities/deviceQuery $ …

cuda ubuntu-14.04 nvidia linux-mint
Double precision floating point in CUDA

Does CUDA support double precision floating point numbers? Also, what are the reasons for the same?

floating-point cuda gpu gpgpu
CUDA allocate memory in __device__ function

Is there a way in CUDA to allocate memory dynamically in device-side functions ? I could not find any examples of …

memory-management cuda dynamic-memory-allocation
CUDA apps time out & fail after several seconds - how to work around this?

I've noticed that CUDA applications tend to have a rough maximum run-time of 5-15 seconds before they will fail and …

cuda timeout gpgpu gpu-programming