When are static and global variables initialized?

Zachary · Jul 22, 2013 · Viewed 43.1k times

In C++, I know that static and global objects are constructed before the main function is called. But as you know, in C there is no such initialization procedure before main.

For example, in my code:

int global_int1 = 5;
int global_int2;
static int static_int1 = 4;
static int static_int2;
  • When are these four variables initialized?
  • Where are the initialization values, such as 5 and 4, stored during compilation? How are they managed during initialization?

EDIT:
Clarification of 2nd question.

  • In my code I use 5 to initialize global_int1, so how does the compiler assign 5 to global_int1? For example, maybe the compiler first stores the value 5 somewhere (e.g. in a table) and fetches it when initialization begins.
  • As for "How are they managed during initialization?", it is really vague, and I myself do not know how to interpret it yet. Sometimes it is not easy to explain a question. Please overlook it, since I have not fully worked out the question yet.

Answer

James Kanze · Jul 22, 2013

By static and global objects, I presume you mean objects with static lifetime defined at namespace scope. When such objects are defined with local scope, the rules are slightly different.

Formally, C++ initializes such variables in three phases:

  1. Zero initialization
  2. Static initialization
  3. Dynamic initialization

The language also distinguishes between variables which require dynamic initialization, and those which require static initialization: all static objects (objects with static lifetime) are first zero initialized, then objects with static initialization are initialized, and then dynamic initialization occurs.

As a simple first approximation, dynamic initialization means that some code must be executed; typically, static initialization requires none. Thus:

extern int f();

int g1 = 42;    //  static initialization
int g2 = f();   //  dynamic initialization

Another approximation would be that static initialization is what C supports (for variables with static lifetime), and dynamic initialization is everything else.
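To make the ordering visible, here is a minimal sketch. All names in it (`seed`, `compute`, `seen_constant`, `report` and so on) are made up for illustration. A dynamically initialized variable reads two variables defined later in the same file; the one with a constant initializer already holds its value, because static initialization is complete before any dynamic initialization starts.

```cpp
#include <cassert>
#include <cstdio>

extern int later_constant;   // defined further down in this file
extern int later_dynamic;    // defined further down in this file

volatile int seed = 7;       // the volatile read keeps compute() from being folded
int compute() { return seed; }

// Both of these need code to run before main (dynamic initialization).
int seen_constant = later_constant;  // sees 10: static initialization is already done
int seen_dynamic  = later_dynamic;   // unspecified by the standard, commonly 0

int later_constant = 10;             // static initialization
int later_dynamic  = compute();      // dynamic initialization

// Hypothetical helper: call it from main to inspect the values.
void report() {
    assert(seen_constant == 10);     // guaranteed by the phase ordering
    assert(later_dynamic == 7);      // set by the time main runs
    std::printf("seen_constant=%d seen_dynamic=%d later_dynamic=%d\n",
                seen_constant, seen_dynamic, later_dynamic);
}
```

The value of `seen_dynamic` is deliberately not asserted: an implementation is allowed to perform a dynamic initialization statically, so the standard leaves the result unspecified. A C compiler would reject the `compute()` initializer outright, since C only accepts constant initializers for objects with static lifetime.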

How the compiler does this depends, of course, on the initialization, but on disk-based systems, where the executable is loaded into memory from disk, the values for static initialization are part of the image on disk, and are loaded directly by the system from the disk. On a classical Unix system, global variables would be divided into three "segments":

text:
The code, loaded into a write protected area. Static variables with `const` types would also be placed here.
data:
Static variables with static initializers.
bss:
Static variables with no initializer (C and C++) or with dynamic initialization (C++). The executable contains no image for this segment, and the system simply sets it all to `0` before starting your code.
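As a rough sketch of how this maps onto the variables from the question: `const_int`, `zero_block`, `sum_all` and `check_segments` are hypothetical additions, and the actual placement is up to the compiler and linker. Modern ELF toolchains, for instance, typically put constants in a separate read-only section rather than in text itself.

```cpp
#include <cassert>

const int  const_int   = 42;   // const, constant initializer: read-only area (if emitted at all)
int        global_int1 = 5;    // data: the 5 is stored in the executable image
int        global_int2;        // bss: no image, zeroed before your code starts
static int static_int1 = 4;    // data: the 4 is stored in the executable image
static int static_int2;        // bss: no image, zeroed before your code starts
int        zero_block[1024];   // bss: takes no space in the file on disk

// Hypothetical helper that touches every variable so none is unused.
int sum_all() {
    return const_int + global_int1 + global_int2
         + static_int1 + static_int2 + zero_block[1023];
}

// Hypothetical helper: call it from main to confirm the initial values.
void check_segments() {
    assert(global_int1 == 5 && static_int1 == 4);   // loaded from the image
    assert(global_int2 == 0 && static_int2 == 0);   // zeroed by the system
    assert(sum_all() == 51);                        // 42 + 5 + 0 + 4 + 0 + 0
}
```

On a GNU toolchain, running `nm` on an unoptimized object file is one way to check the placement: it typically marks data symbols with D or d and bss symbols with B or b.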

I suspect that a lot of modern systems still use something similar.

EDIT:

One additional remark: the above refers to C++03. For existing programs, C++11 probably doesn't change anything, but it does add `constexpr` (which means that an initializer calling certain user-defined functions can still qualify for static initialization) and thread-local variables, which opens up a whole new can of worms.
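A small sketch of the `constexpr` point, assuming a C++11 or later compiler. The names `square`, `table_size`, `seen_early`, `per_thread_hits` and `check_constexpr` are invented for the example.

```cpp
#include <cassert>

constexpr int square(int x) { return x * x; }  // usable in constant expressions

extern int table_size;             // defined below
int seen_early = table_size;       // dynamic initializer, defined before table_size
int table_size = square(6);        // constant expression, so static initialization

thread_local int per_thread_hits = 0;  // C++11: each thread gets its own copy

// Hypothetical helper: call it from main to confirm the values.
void check_constexpr() {
    assert(table_size == 36);
    assert(seen_early == 36);      // table_size was already set when this was read
    assert(per_thread_hits == 0);
}
```

Without `constexpr` on `square`, the initializer of `table_size` would be an ordinary function call and so would require dynamic initialization, and `seen_early` would no longer be guaranteed to see 36.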