Unpredictable CUDNN_STATUS_NOT_INITIALIZED on Windows

Dims · Jul 11, 2017 · Viewed 13.3k times

I am running Keras neural network training and prediction on a GTX 1070 on Windows 10. Most of the time it works, but from time to time it fails with:

E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:359] could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:366] error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:326] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
F c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\kernels\conv_ops.cc:659] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)

Neither the literal meaning of the error message nor an out-of-memory (OOM) error explains it.

How can I fix this?

Answer

elf · Aug 1, 2017

Try limiting your GPU memory usage by setting the GPU option per_process_gpu_memory_fraction.

Fiddle around with it to see what works and what doesn't.

I recommend 0.7 as a starting point.
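
For reference, here is a minimal sketch of how that option can be passed to Keras. It assumes the TensorFlow 1.x session API that was current when this was posted (tf.GPUOptions, tf.ConfigProto, tf.Session) and Keras running on the TensorFlow backend. The helper names check_fraction and limit_gpu_memory are made up for illustration and are not part of either library.

```python
def check_fraction(fraction):
    """Return fraction as a float, rejecting values outside (0, 1]."""
    fraction = float(fraction)
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1], got %r" % fraction)
    return fraction


def limit_gpu_memory(fraction=0.7):
    """Install a Keras session capped at `fraction` of each visible GPU's memory.

    Hypothetical helper; assumes TensorFlow 1.x and the Keras TensorFlow backend.
    """
    # Imported lazily so check_fraction can be used without TensorFlow installed.
    import tensorflow as tf
    from keras import backend as K

    gpu_options = tf.GPUOptions(
        per_process_gpu_memory_fraction=check_fraction(fraction))
    session = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
    K.set_session(session)  # Keras uses this session for all later work
    return session


# Usage: call once, before building or loading any model.
# limit_gpu_memory(0.7)
```

If the error still shows up at 0.7, lowering the value in small steps is probably the simplest way to find a setting that works on a given machine.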