The Java tutorials say that creating a Thread is expensive. But why exactly is it expensive? What exactly is happening when a Java Thread is created that makes its creation expensive? I'm taking the statement as true, but I'm just interested in mechanics of Thread creation in JVM.
Thread lifecycle overhead. Thread creation and teardown are not free. The actual overhead varies across platforms, but thread creation takes time, introducing latency into request processing, and requires some processing activity by the JVM and OS. If requests are frequent and lightweight, as in most server applications, creating a new thread for each request can consume significant computing resources.
From Java Concurrency in Practice
By Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, Doug Lea
Print ISBN-10: 0-321-34960-1
Java thread creation is expensive because there is a fair bit of work involved:
It is also expensive in the sense that the thread ties down resources as long as it is alive; e.g. the thread stack, any objects reachable from the stack, the JVM thread descriptors, the OS native thread descriptors.
The costs of all of these things are platform specific, but they are not cheap on any Java platform I've ever come across.
A Google search found me an old benchmark that reports a thread creation rate of ~4000 per second on a Sun Java 1.4.1 on a 2002 vintage dual processor Xeon running 2002 vintage Linux. A more modern platform will give better numbers ... and I can't comment on the methodology ... but at least it gives a ballpark for how expensive thread creation is likely to be.
Peter Lawrey's benchmarking indicates that thread creation is significantly faster these days in absolute terms, but it is unclear how much of this is due improvements in Java and/or the OS ... or higher processor speeds. But his numbers still indicate a 150+ fold improvement if you use a thread pool versus creating/starting a new thread each time. (And he makes the point that this is all relative ...)
(The above assumes "native threads" rather than "green threads", but modern JVMs all use native threads for performance reasons. Green threads are possibly cheaper to create, but you pay for it in other areas.)
I've done a bit of digging to see how a Java thread's stack really gets allocated. In the case of OpenJDK 6 on Linux, the thread stack is allocated by the call to pthread_create
that creates the native thread. (The JVM does not pass pthread_create
a preallocated stack.)
Then, within pthread_create
the stack is allocated by a call to mmap
as follows:
mmap(0, attr.__stacksize,
PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
According to man mmap
, the MAP_ANONYMOUS
flag causes the memory to be initialized to zero.
Thus, even though it might not be essential that new Java thread stacks are zeroed (per the JVM spec), in practice (at least with OpenJDK 6 on Linux) they are zeroed.