OS: Windows 10
VS: Visual Studio 2015, 64-bit
CUDA: CUDA 8.0
Python: Python 2.7.12, 64-bit (PyCUDA)
I followed the tutorial at https://documen.tician.de/pycuda/tutorial.html#getting-started:
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy
a = numpy.random.randn(4,4)
a = a.astype(numpy.float32)
a_gpu = cuda.mem_alloc(a.nbytes)
cuda.memcpy_htod(a_gpu, a)  # transfer the data to the GPU
# Executing a kernel:
# write code to double each entry in a_gpu.
# We write the corresponding CUDA C code and feed it into the constructor of pycuda.compiler.SourceModule.
mod = SourceModule("""
__global__ void doublify(float *a)
{
int idx = threadIdx.x + threadIdx.y*4;
a[idx] *= 2;
}
""")
#If there aren’t any errors, the code is now compiled and loaded onto the device. We find a reference to our pycuda.driver.Function and call it, specifying a_gpu as the argument, and a block size of 4x4:
func = mod.get_function("doublify")
func(a_gpu, block=(4,4,1))
#Finally, we fetch the data back from the GPU and display it, together with the original a:
a_doubled = numpy.empty_like(a)
cuda.memcpy_dtoh(a_doubled, a_gpu)
print a_doubled
print a
But it failed with this error:
Traceback (most recent call last):
File "G:/myworkspace/python2.7/cuda/test.py", line 24, in <module>
""")
File "D:\python2.7\lib\site-packages\pycuda\compiler.py", line 265, in __init__
arch, code, cache_dir, include_dirs)
File "D:\python2.7\lib\site-packages\pycuda\compiler.py", line 255, in compile
return compile_plain(source, options, keep, nvcc, cache_dir, target)
File "D:\python2.7\lib\site-packages\pycuda\compiler.py", line 137, in compile_plain
stderr=stderr.decode("utf-8", "replace"))
CompileError: nvcc compilation of c:\users\gl\appdata\local\temp\tmp8poxqp\kernel.cu failed
[command: nvcc --cubin -arch sm_50 -m64 -Id:\python2.7\lib\site-packages\pycuda\cuda kernel.cu]
[stdout:
nvcc fatal : Cannot find compiler 'cl.exe' in PATH
]
Someone suggested adding the directory that contains cl.exe to the PATH environment variable. I did, but the error is the same. I'm new to CUDA. How can I solve this problem?
I did as @citizenSNIPS advised:
Added the path to cl.exe, D:\vs2015\VC\bin, to PATH.
INCLUDE = C:\Program Files (x86)\Windows Kits\10\Include\10.0.10240.0\ucrt
LIB = C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x64 (I can't find C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\um\x64 on my computer).
Now there is a new error:
Traceback (most recent call last):
File "G:\myworkspace\python2.7\cuda\test.py", line 24, in <module>
""")
File "D:\python2.7\lib\site-packages\pycuda\compiler.py", line 265, in __init__
arch, code, cache_dir, include_dirs)
File "D:\python2.7\lib\site-packages\pycuda\compiler.py", line 255, in compile
return compile_plain(source, options, keep, nvcc, cache_dir, target)
File "D:\python2.7\lib\site-packages\pycuda\compiler.py", line 147, in compile_plain
+ (stdout+stderr).decode("utf-8", "replace"), stacklevel=4)
File "D:\python2.7\lib\idlelib\run.py", line 36, in idle_showwarning_subproc
message, category, filename, lineno, line))
File "D:\python2.7\lib\idlelib\PyShell.py", line 65, in idle_formatwarning
s += "%s: %s\n" % (category.__name__, message)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 147-168: ordinal not in range(128)
I'm now working on this problem. Maybe it's because I did not add C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\um\x64?
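One possible reading of this second traceback, offered as an assumption rather than a diagnosis: the frame at compiler.py line 147 builds a message from nvcc's stdout and stderr, and the exception is raised afterwards, inside IDLE's warning formatter, when that message contains non-ASCII characters (for example, localized compiler output). The sketch below only reproduces that kind of failure in isolation. The sample message and the helper name ascii_safe are hypothetical and are not part of PyCUDA or IDLE.

```python
# Hypothetical stand-in for compiler output that contains non-ASCII text.
message = u"kernel.cu: \u8b66\u544a (warning)"


def ascii_safe(text):
    """Encode text as ASCII, escaping every non-ASCII character."""
    return text.encode("ascii", "backslashreplace")


try:
    # A strict ASCII conversion fails on the two non-ASCII characters.
    message.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

print(failed)
print(ascii_safe(message))
```

If this reading is right, the error comes from displaying the compiler's output rather than from the missing um\x64 directory, but that remains to be confirmed on the affected machine.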
You need to specify the path to cl.exe.
Under System variables, find PATH, click Edit, and add the path to cl.exe. It should be:
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\
Make sure that you selected the C++ compiler when you installed Visual Studio; it is not installed by default. If you didn't, re-run the Visual Studio installer and select the C++ compiler.
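A quick way to check whether the Python process can actually see cl.exe is to search PATH from inside Python. This is a minimal sketch; the helper name find_on_path is made up for illustration. Keep in mind that a process only sees the environment variables that existed when it started, so IDLE or the command prompt has to be restarted after editing PATH.

```python
import os


def find_on_path(name, path=None):
    """Return the full path of the first file called `name` found in the
    directories of `path` (defaults to the PATH variable), or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')  # PATH entries are sometimes quoted
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Prints the location of cl.exe, or None if no PATH directory contains it.
print(find_on_path("cl.exe"))
```

If this prints None in the same environment that runs the script, nvcc will most likely report the same "Cannot find compiler 'cl.exe' in PATH" error.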
Once you finish with that, you might need to add the following system variables:
INCLUDE = C:\Program Files (x86)\Windows Kits\10\Include\10.0.10240.0\ucrt
LIB = C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\um\x64
C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x64
see this thread here