"g++ not detected" when the data set gets larger: is there a limit on matrix size on the GPU?

seven7e · Nov 12, 2015 · Viewed 17.6k times

I got this message while using Keras to train an RNN language model on a large 3D tensor (generated from a text and one-hot encoded, giving a shape of (165717, 25, 7631)):

WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to 
execute optimized C-implementations (for both CPU and GPU) and will default to 
Python implementations. Performance will be severely degraded. To remove this 
warning, set Theano flags cxx to an empty string.
ERROR (theano.sandbox.cuda): nvcc compiler not found on $PATH. Check your nvcc 
installation and try again.

But everything works fine when I limit the data set to a small size. So I wonder: does Theano or CUDA limit the size of a matrix?

Also, is there a better way to do the one-hot representation? In the large 3D tensor, most elements are 0 because of the one-hot encoding. However, I didn't find a layer that accepts an index representation of words.
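
For a sense of scale, here is a rough sketch of what that shape implies in memory. It assumes 4 bytes per element (float32) for the dense tensor and 4 bytes per word (int32) for an index representation; the post does not state the dtype, so both are assumptions. Under them, the dense tensor comes to roughly 126 GB and the indices to under 17 MB. The one_hot_batches helper is hypothetical and only illustrates expanding indices to one-hot one batch at a time instead of all at once.

```python
# Shape reported in the question: (samples, timesteps, vocabulary size).
SHAPE = (165717, 25, 7631)


def dense_one_hot_bytes(shape, bytes_per_element=4):
    """Bytes needed to hold the full one-hot tensor (float32 assumed)."""
    samples, timesteps, vocab = shape
    return samples * timesteps * vocab * bytes_per_element


def index_bytes(shape, bytes_per_index=4):
    """Bytes needed if each word is stored as one integer index (int32 assumed)."""
    samples, timesteps, _ = shape
    return samples * timesteps * bytes_per_index


def one_hot_batches(index_rows, vocab_size, batch_size):
    """Yield one-hot batches lazily, so only one batch is dense at a time.

    Hypothetical helper for illustration; index_rows is a list of rows,
    each row a list of integer word indices.
    """
    for start in range(0, len(index_rows), batch_size):
        batch = index_rows[start:start + batch_size]
        yield [[[1 if i == word else 0 for i in range(vocab_size)]
                for word in row]
               for row in batch]


if __name__ == "__main__":
    print("dense one-hot: %.1f GB" % (dense_one_hot_bytes(SHAPE) / 1e9))
    print("word indices:  %.1f MB" % (index_bytes(SHAPE) / 1e6))
```

Whether this memory footprint is what triggers the warning above is not established here; the numbers only show how much smaller the index form is.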

Answer

user3598832 · Oct 2, 2016
conda install mingw libpython

Make sure this is installed. I got this answer from another post, https://stackoverflow.com/a/31109547/3598832, which says it comes from the manual.
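
To check whether the install worked, one option is to confirm that the two compilers named in the messages are actually on PATH. This is a minimal sketch using only the Python standard library (Python 3.3 or later); find_compilers is a hypothetical helper name, and it only mirrors the PATH lookup, not Theano's own detection logic.

```python
import shutil


def find_compilers(names=("g++", "nvcc")):
    """Map each compiler name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print("%-5s %s" % (name, path or "NOT FOUND on PATH"))
```

If either name is reported as not found, the corresponding message from Theano is expected regardless of data size.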