What happens to global and static variables in a shared library when it is dynamically linked?

Raja picture Raja · Oct 15, 2013 · Viewed 93.3k times · Source

I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application. By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.

I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).

  • What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?

  • And what happens when the application uses a module B with run-time dynamic linking?

  • If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?

  • Do DLLs A and B get access to the applications globals?

(Please state your reasons as well)

Quoting from MSDN:

Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.

and from here:

When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.

Thanks.

Answer

Mikael Persson picture Mikael Persson · Oct 15, 2013

This is a pretty famous difference between Windows and Unix-like systems.

No matter what:

  • Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
  • The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).

So, the key issue here is really visibility.

In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.

Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.

In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).

To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:

#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif

MY_DLL_EXPORT int my_global;

When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.

In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.

Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).

And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).