In my Java classes, I sometimes use a ThreadLocal, mainly as a means of avoiding unnecessary object creation:
    import java.util.Calendar;
    import java.util.Date;
    import java.util.GregorianCalendar;

    @net.jcip.annotations.ThreadSafe
    public class DateSensitiveThing {
        private final Date then;

        public DateSensitiveThing(Date then) {
            this.then = then;
        }

        // One Calendar per thread, created lazily on first use by that thread
        private static final ThreadLocal<Calendar> threadCal = new ThreadLocal<Calendar>() {
            @Override
            protected Calendar initialValue() {
                return new GregorianCalendar();
            }
        };

        public Date doCalc(int n) {
            Calendar c = threadCal.get();
            c.setTime(this.then);
            // use n to mutate c
            return c.getTime();
        }
    }
I do this for the proper reason: GregorianCalendar is one of those gloriously stateful, mutable, non-threadsafe objects which provide a service across multiple calls rather than representing a value. Further, it is considered to be 'expensive' to instantiate (whether this is true or not is not the point of this question). (Overall, I really admire it :-))
However, if I use such a class in any environment which pools threads, and where my application is not in control of the lifecycle of those threads, then there is the potential for memory leaks. A Servlet environment is a good example.
In fact, Tomcat 7 whinges like so when a webapp is stopped:
SEVERE: The web application [] created a ThreadLocal with key of type [org.apache.xmlbeans.impl.store.CharUtil$1] (value [org.apache.xmlbeans.impl.store.CharUtil$1@2aace7a7]) and a value of type [java.lang.ref.SoftReference] (value [java.lang.ref.SoftReference@3d9c9ad4]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
Dec 13, 2012 12:54:30 PM org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks
(Not even my code doing it, in that particular case).
This hardly seems fair. Tomcat is blaming me (or the user of my class) for doing the right thing.
Ultimately, that's because Tomcat wants to reuse threads it offered to me, for other web apps. (Ugh - I feel dirty.) Probably, it's not a great policy on Tomcat's part - because threads actually do have/cause state - don't share 'em between applications.
However, this policy is at least common, even if it is not desirable. I feel obliged, as a ThreadLocal user, to provide a way for my class to 'release' the resources which it has attached to various threads.
What is the right thing to do here?
To me, it seems like the servlet engine's thread-reuse policy is at odds with the intent behind ThreadLocal.
But maybe I should provide a facility to allow users to say "begone, evil thread-specific state associated with this class, even though I am in no position to let the thread die and let GC do its thing"? Is it even possible for me to do this? I mean, it's not like I can arrange for ThreadLocal#remove() to be called on each of the Threads which saw ThreadLocal#initialValue() at some time in the past. Or is there another way?
Or should I just say to my users "go and get yourself a decent classloader and thread pool implementation"?
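The limitation described above, that ThreadLocal#remove() only clears the calling thread's entry, can be seen in a small sketch. The class name RemoveScopeDemo and the StringBuilder payload are hypothetical stand-ins, and ThreadLocal.withInitial assumes Java 8 or later.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RemoveScopeDemo {
    // Hypothetical per-thread buffer standing in for any cached object.
    static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    /**
     * Returns true if the pooled worker still sees its original value
     * after a different thread (the caller) has invoked remove().
     */
    static boolean workerValueSurvivesRemoveElsewhere() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<StringBuilder> readOnWorker = BUFFER::get;

            // First read on the worker thread creates the worker's value.
            StringBuilder before = pool.submit(readOnWorker).get();

            // The calling thread creates and then removes its own value.
            BUFFER.get();
            BUFFER.remove();

            // Second read on the same worker thread.
            StringBuilder after = pool.submit(readOnWorker).get();
            return before == after;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The worker keeps the same instance across both reads, so a cleanup call made from any other thread (for example a webapp shutdown hook) leaves the pooled thread's value in place.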
EDIT#1: Clarified how threadCal is used in a vanilla utility class which is unaware of thread lifecycles
EDIT#2: Fixed a thread safety issue in DateSensitiveThing
Well, a bit late to the party on this one. In October 2007, Josh Bloch (co-author of java.lang.ThreadLocal along with Doug Lea) wrote:
"The use of thread pools demands extreme care. Sloppy use of thread pools in combination with sloppy use of thread locals can cause unintended object retention, as has been noted in many places."
People were complaining about the bad interaction of ThreadLocal with thread pools even then. But Josh did sanction:
"Per-thread instances for performance. Aaron's SimpleDateFormat example (above) is one example of this pattern."
If you keep such per-thread instances in a ThreadLocal and need to release them later, you have limited options for doing that. Either:
a) you know that the Thread(s) where you put values will terminate when your application is finished; OR
b) you can later arrange for the same thread that invoked ThreadLocal#set() to invoke ThreadLocal#remove() whenever your application terminates.
In short, deciding to use a ThreadLocal as a form of fast, uncontended access to "per thread instance pools" is not a decision to be taken lightly.
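One way to satisfy option (b) is to scope the value to a single task, so that the thread which populated the ThreadLocal also clears it. This is only a sketch: the class ScopedCalendarTask and the "add n days" calculation are assumptions for illustration, and ThreadLocal.withInitial assumes Java 8 or later.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;

class ScopedCalendarTask {
    static final ThreadLocal<Calendar> THREAD_CAL =
            ThreadLocal.withInitial(GregorianCalendar::new);

    /** Adds n days to 'then' on a pooled thread, then clears that thread's Calendar. */
    static Date addDaysOnPool(ExecutorService pool, Date then, int n) {
        Callable<Date> task = () -> {
            try {
                Calendar c = THREAD_CAL.get();
                c.setTime(then);
                c.add(Calendar.DAY_OF_MONTH, n); // assumed calculation
                return c.getTime();
            } finally {
                // Runs on the same pooled thread that called get() above.
                THREAD_CAL.remove();
            }
        };
        try {
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note the trade-off: removing the value at the end of every task means the Calendar is recreated for the next task, so most of the caching benefit is gone. That is part of why the decision is not one to take lightly.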
NOTE: There are uses of ThreadLocal other than 'object pools', and these lessons do not apply where the ThreadLocal is only intended to be set on a temporary basis anyway, or where there is genuine per-thread state to keep track of.
There are some consequences for library implementors (even where such libraries are simple utility classes in your project).
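Either:
Use ThreadLocal anyway, accepting that long-lived pooled threads may retain your objects. If you are implementing something like java.util.concurrent.ThreadLocalRandom, it might be appropriate. (Tomcat might still whinge at users of your library, if you aren't implementing in java.*.) It's interesting to note the discipline with which java.* makes sparing use of the ThreadLocal technique.
OR
Use ThreadLocal, but give clients a way to clean up, so that a client can say "I will arrange for every thread which used you to invoke LibClass.releaseThreadLocalsForThread() when I am finished with them." Makes your library 'hard to use properly', though.
OR
Let clients supply the expensive objects themselves: "OK, I can give you a new ExpensiveObjectFactory<T>() { public T get() {...} } if you think it is really necessary". Not so bad. If the objects are really that important and that expensive to create, explicit pooling is probably worthwhile.
OR
(JavaEE) Have the container offer a cleanup hook, something along the lines of a hypothetical servletContext.addThreadCleanupHandler(new Handler() {@Override cleanup() {...}})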
It'd be nice to see some standardisation around the last 3 items, in future JavaEE specs.
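As a rough sketch of the middle two options, a library class could accept the source of its expensive objects from the client and also expose a per-thread release method. LibClass.releaseThreadLocalsForThread() follows the name used above; everything else here (the yearOf method, the two source constants, and the use of java.util.function.Supplier in place of the ExpensiveObjectFactory<T> mentioned above) is an assumption for illustration, on Java 8 or later.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.function.Supplier;

class LibClass {
    private static final ThreadLocal<Calendar> PER_THREAD =
            ThreadLocal.withInitial(GregorianCalendar::new);

    /** Cached source: one Calendar per thread, retained until released. */
    static final Supplier<Calendar> THREAD_LOCAL_SOURCE = PER_THREAD::get;

    /** Opt-out source: a fresh Calendar per call, nothing retained on the thread. */
    static final Supplier<Calendar> FRESH_SOURCE = GregorianCalendar::new;

    private final Supplier<Calendar> calendars;

    /** The client decides where Calendars come from, including its own pool. */
    LibClass(Supplier<Calendar> calendars) {
        this.calendars = calendars;
    }

    /** Hypothetical calculation: the calendar year of the given instant. */
    int yearOf(long epochMillis) {
        Calendar c = calendars.get();
        c.setTimeInMillis(epochMillis);
        return c.get(Calendar.YEAR);
    }

    /**
     * Clears the cached Calendar of the calling thread only. Clients must
     * invoke this on every thread that used THREAD_LOCAL_SOURCE.
     */
    static void releaseThreadLocalsForThread() {
        PER_THREAD.remove();
    }
}
```

A client that cannot arrange per-thread cleanup would construct `new LibClass(LibClass.FRESH_SOURCE)`; one that can would use THREAD_LOCAL_SOURCE and call releaseThreadLocalsForThread() from each worker before handing the thread back.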
Actually, instantiation of a GregorianCalendar is pretty lightweight. It's the unavoidable call to setTime() which incurs most of the work. It also doesn't hold any significant state between different points of a thread's execution. Putting a Calendar into a ThreadLocal is unlikely to give you back more than it costs you ... unless profiling definitely shows a hot spot in new GregorianCalendar().
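For comparison, here is a minimal sketch of the question's class without any ThreadLocal. The class name SimpleDateSensitiveThing is hypothetical, and "use n to mutate c" is assumed to mean adding n days.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

final class SimpleDateSensitiveThing {
    // Store the instant rather than the mutable Date passed in.
    private final long thenMillis;

    SimpleDateSensitiveThing(Date then) {
        this.thenMillis = then.getTime();
    }

    Date doCalc(int n) {
        // A fresh Calendar per call: no shared state, nothing left on the thread.
        Calendar c = new GregorianCalendar();
        c.setTimeInMillis(thenMillis);
        c.add(Calendar.DAY_OF_MONTH, n); // assumed meaning of "use n to mutate c"
        return c.getTime();
    }
}
```

The class is thread-safe because each call works on its own Calendar, and there is nothing for a servlet container to complain about when the webapp stops.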
new SimpleDateFormat(String) is expensive by comparison, because it has to parse the format string. Once parsed, the 'state' of the object is significant to later uses by the same thread. It's a better fit. But it might still be 'less expensive' to instantiate a new one than to give your classes extra responsibilities.