I am having a hard time compiling a simple CUDA program consisting of only two files.
The main.c looks like this:
#include "my_cuda.h"
int main(int argc, char** argv){
    dummy_gpu();
}
The my_cuda.h looks like this:
#ifndef MY_DUMMY
#define MY_DUMMY
void dummy_gpu();
#endif
And the my_cuda.cu file looks like this:
#include <cuda_runtime.h>
#include "my_cuda.h"
__global__ void dummy_gpu_kernel(){
    //do something
}
void dummy_gpu(){
    dummy_gpu_kernel<<<128,128>>>();
}
However, when I compile, I always get the following error:
gcc -I/usr/local/cuda/5.0.35/include/ -c main.c
nvcc -c my_cuda.cu
gcc -L/usr/local_rwth/sw/cuda/5.0.35/lib64 -lcuda -lcudart -o md.exe main.o my_cuda.o
main.o: In function `main':
main.c:(.text+0x15): undefined reference to `dummy_gpu'
collect2: ld returned 1 exit status
Thank you for your help.
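You have a problem with symbol name mangling. nvcc uses the host C++ compiler to compile host code, so C++ name mangling is applied to the symbols emitted by the CUDA toolchain. Your main.o, compiled as C by gcc, refers to the plain symbol dummy_gpu, while my_cuda.o defines a mangled one (with GCC's ABI, _Z9dummy_gpuv), so the linker cannot match the two.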
There are two solutions to this problem. The first is to define dummy_gpu using C linkage, so change your my_cuda.cu to something like this:
extern "C" {
#include "my_cuda.h"
}
.....
extern "C"
void dummy_gpu(){
    dummy_gpu_kernel<<<128,128>>>();
}
Note that you will also need to change your link command to this:
gcc -L/usr/local_rwth/sw/cuda/5.0.35/lib64 -o md.exe main.o my_cuda.o -lcuda -lcudart
because the linker resolves symbols from left to right, so the CUDA shared libraries need to be specified after the object files that use them.
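A common variant of this first solution, sketched here on the assumption that you are free to edit my_cuda.h, is to put the linkage guard in the header itself so that main.c and my_cuda.cu can both include it unchanged. The __cplusplus macro is predefined by every C++ compiler (including the host compiler that nvcc invokes), so the extern "C" wrapper is only seen when the header is compiled as C++. Writing the parameter list as (void) instead of () is my own adjustment, so that the declaration means the same thing in C and in C++:

```c
/* my_cuda.h: usable from both C (main.c) and C++/CUDA (my_cuda.cu) */
#ifndef MY_DUMMY
#define MY_DUMMY

#ifdef __cplusplus
extern "C" {            /* seen only by C++ compilers, e.g. nvcc's host pass */
#endif

/* (void) rather than () so C and C++ agree that it takes no arguments */
void dummy_gpu(void);

#ifdef __cplusplus
}                       /* closes extern "C" */
#endif

#endif /* MY_DUMMY */
```

With a header like this, my_cuda.cu can use a plain #include "my_cuda.h" and define dummy_gpu without repeating extern "C", because the definition takes the linkage of the earlier declaration. The library ordering in the link command still matters.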
Your second alternative would be to use either g++ or nvcc to do the linking, in which case the whole problem should disappear.