I'm involved in one of those challenges where you try to produce the smallest possible binary, so I'm building my program without the C or C++ run-time libraries (RTL). I don't link to the DLL version or the static version. I don't even #include the header files. I have this working fine.
Some RTL functions, like memset(), can be useful, so I tried adding my own implementation. It works fine in Debug builds (even for those places where the compiler generates an implicit call to memset()). But in Release builds, I get an error saying that I cannot define an intrinsic function. You see, in Release builds, intrinsic functions are enabled, and memset() is an intrinsic.
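To illustrate what an implicit call can look like, here is a minimal sketch. The names `Big`, `zeroed_sum`, and `check_zeroed_sum` are hypothetical. Whether the compiler really emits a call to memset() here, rather than inline stores, depends on the compiler version, the object size, and the optimization settings, so treat that as an assumption to verify in the generated assembly:

```cpp
#include <cassert>  // used only by the hosted sanity check at the bottom

// Hypothetical aggregate, large enough that a compiler may prefer a
// memset() call over individual stores when zeroing it.
struct Big {
    int values[64];
};

// No memset() appears in this source, but the empty initializer asks for
// every member to be zeroed, and the compiler is free to do that by
// calling memset() behind the scenes.
int zeroed_sum() {
    Big b = {};
    int sum = 0;
    for (int i = 0; i < 64; ++i) {
        sum += b.values[i];
    }
    return sum;
}

// Sanity check for an ordinary hosted build (assert comes from the RTL,
// so this part does not belong in the no-RTL program itself).
void check_zeroed_sum() {
    assert(zeroed_sum() == 0);
}
```

In a build with /NODEFAULTLIB, code like `zeroed_sum()` is where an unresolved `_memset` can show up even though the source never names the function.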
I would love to use the intrinsic for memset() in my release builds, since it's probably inlined and smaller and faster than my implementation. But I seem to be in a catch-22. If I don't define memset(), the linker complains that it's undefined. If I do define it, the compiler complains that I cannot define an intrinsic function.
Does anyone know the right combination of definition, declaration, #pragma, and compiler and linker flags to get an intrinsic function without pulling in RTL overhead?
Visual Studio 2008, x86, Windows XP+.
To make the problem a little more concrete:
extern "C" void * __cdecl memset(void *, int, size_t);

#ifdef IMPLEMENT_MEMSET
void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
    char *p = reinterpret_cast<char *>(pTarget);
    while (cbTarget > 0) {
        *p++ = static_cast<char>(value);
        --cbTarget;
    }
    return pTarget;
}
#endif

struct MyStruct {
    int foo[10];
    int bar;
};

int main() {
    MyStruct blah;
    memset(&blah, 0, sizeof(blah));
    return blah.bar;
}
And I build like this:
cl /c /W4 /WX /GL /Ob2 /Oi /Oy /Gs- /GF /Gy intrinsic.cpp
link /SUBSYSTEM:CONSOLE /LTCG /DEBUG /NODEFAULTLIB /ENTRY:main intrinsic.obj
If I compile with my implementation of memset(), I get a compiler error:
error C2169: 'memset' : intrinsic function, cannot be defined
If I compile this without my implementation of memset(), I get a linker error:
error LNK2001: unresolved external symbol _memset
I think I finally found a solution:
First, in a header file, declare memset() with a pragma, like so:
extern "C" void * __cdecl memset(void *, int, size_t);
#pragma intrinsic(memset)
That allows your code to call memset(). In most cases, the compiler will inline the intrinsic version.
Second, in a separate implementation file, provide an implementation. The trick to preventing the compiler from complaining about re-defining an intrinsic function is to use another pragma first. Like this:
#pragma function(memset)
void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
    unsigned char *p = static_cast<unsigned char *>(pTarget);
    while (cbTarget-- > 0) {
        *p++ = static_cast<unsigned char>(value);
    }
    return pTarget;
}
This provides an implementation for those cases where the optimizer decides not to use the intrinsic version.
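Because the fallback only runs when the optimizer declines to inline the intrinsic, it is easy to leave it unexercised. One option, sketched below, is to compile the same loop under a different name in an ordinary build that links the RTL and check the properties callers expect. The names `fill_bytes` and `check_fill_bytes` are hypothetical and not part of the solution above:

```cpp
#include <cassert>
#include <cstddef>

// Same loop as the memset() fallback, under a hypothetical name so it can
// coexist with the real memset() in a normal (RTL-linked) test build.
void *fill_bytes(void *pTarget, int value, std::size_t cbTarget) {
    unsigned char *p = static_cast<unsigned char *>(pTarget);
    while (cbTarget-- > 0) {
        *p++ = static_cast<unsigned char>(value);
    }
    return pTarget;
}

// Checks the behaviour that callers of memset() rely on.
void check_fill_bytes() {
    unsigned char buf[8] = {1, 2, 3, 4, 5, 6, 7, 8};

    // The destination pointer comes back unchanged.
    assert(fill_bytes(buf + 2, 0, 4) == buf + 2);

    // Only bytes 2..5 were written; the neighbours are intact.
    const unsigned char expected[8] = {1, 2, 0, 0, 0, 0, 7, 8};
    for (int i = 0; i < 8; ++i) {
        assert(buf[i] == expected[i]);
    }

    // The fill value is truncated to a single byte.
    fill_bytes(buf, 0x1FF, 1);
    assert(buf[0] == 0xFF);

    // A zero length writes nothing.
    fill_bytes(buf, 0xAA, 0);
    assert(buf[0] == 0xFF);
}
```

Calling `check_fill_bytes()` from a small test program is enough to exercise the loop. The no-RTL build itself stays unchanged, since `assert` comes from the run-time library.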
The remaining drawback is that you have to disable whole-program optimization (the /GL compiler flag and the /LTCG linker flag). I'm not sure why. If someone finds a way to do this without disabling whole-program optimization, please chime in.