How is Meyers' implementation of a Singleton actually a Singleton

lbrendanl picture lbrendanl · Jul 18, 2013 · Viewed 38.8k times · Source

I have been reading a lot about Singletons, when they should and shouldn't be used, and how to implement them safely. I am writing in C++11, and have come across the Meyer's lazy initialized implementation of a singleton, as seen in this question.

This implementation is:

static Singleton& instance()
{
     static Singleton s;
     return s;
}

I understand how this is thread safe from other questions here on SO, but what I don't understand is how this is actually a singleton pattern. I have implemented singletons in other languages, and these always end up something like this example from Wikipedia:

public class SingletonDemo {
        private static volatile SingletonDemo instance = null;

        private SingletonDemo() {       }

        public static SingletonDemo getInstance() {
                if (instance == null) {
                        synchronized (SingletonDemo .class){
                                if (instance == null) {
                                        instance = new SingletonDemo ();
                                }
                      }
                }
                return instance;
        }
}

When I look at this second example, it is very intuitive how this is a singleton, since the class holds a reference to one instance of itself, and only ever returns that instance. However, in the first example, I don't understand how this prevents there ever existing two instances of the object. So my questions are:

  1. How does the first implementation enforce a singleton pattern? I assume it has to do with the static keyword, but I am hoping that someone can explain to me in depth what is happening under the hood.
  2. Between these two implementation styles, is one preferable over the other? What are the pros and cons?

Thanks for any help,

Answer

Sean Middleditch picture Sean Middleditch · Jul 18, 2013

This is a singleton because static storage duration for a function local means that only one instance of that local exists in the program.

Under the hood, this can very roughly be considered to be equivalent to the following C++98 (and might even be implemented vaguely like this by a compiler):

static bool __guard = false;
static char __storage[sizeof(Singleton)]; // also align it

Singleton& Instance() {
  if (!__guard ) {
    __guard = true;
    new (__storage) Singleton();
  }
  return *reinterpret_cast<Singleton*>(__storage);
}

// called automatically when the process exits
void __destruct() {
  if (__guard)
    reinterpret_cast<Singleton*>(__storage)->~Singleton();
}

The thread safety bits make it get a bit more complicated, but it's essentially the same thing.

Looking at an actual implementation for C++11, there is a guard variable for each static (like the boolean above), which is also used for barriers and threads. Look at Clang's AMD64 output for:

Singleton& instance() {
   static Singleton instance;
   return instance;
}

The AMD64 assembly for instance from Ubuntu's Clang 3.0 on AMD64 at -O1 (courtesy of http://gcc.godbolt.org/ is:

instance():                           # @instance()
  pushq %rbp
  movq  %rsp, %rbp
  movb  guard variable for instance()::instance(%rip), %al
  testb %al, %al
  jne   .LBB0_3
  movl  guard variable for instance()::instance, %edi
  callq __cxa_guard_acquire
  testl %eax, %eax
  je    .LBB0_3
  movl  instance()::instance, %edi
  callq Singleton::Singleton()
  movl  guard variable for instance()::instance, %edi
  callq __cxa_guard_release
.LBB0_3:
  movl  instance()::instance, %eax
  popq  %rbp
  ret

You can see that it references a global guard to see if initialization is required, uses __cxa_guard_acquire, tests the initialization again, and so on. Exactly in almost every way like version you posted from Wikipedia, except using AMD64 assembly and the symbols/layout specified in the Itanium ABI.

Note that if you run that test you should give Singleton a non-trivial constructor so it's not a POD, otherwise the optimizer will realize that there's no point to doing all that guard/locking work.