What's the best way to cache expensive data obtained from reflection? For example most fast serializers cache such information so they don't need to reflect every time they encounter the same type again. They might even generate a dynamic method which they look up from the type.
Traditionally I've used a normal static dictionary for that. For example:
private static readonly ConcurrentDictionary<Type, Action<object>> cache =
    new ConcurrentDictionary<Type, Action<object>>();

public static void DoSomething(object o)
{
    Action<object> action;
    if (cache.TryGetValue(o.GetType(), out action)) // Simple lookup, fast!
    {
        action(o);
    }
    else
    {
        // Do reflection to get the action and add it to the cache
        // slow
    }
}
This leaks a bit of memory, but since it does so only once per Type, and types used to live as long as the AppDomain, I didn't consider that a problem.
But .NET 4 now introduces Collectible Assemblies for Dynamic Type Generation. If I ever use DoSomething on an object declared in a collectible assembly, that assembly will never get unloaded. Ouch.
So what's the best way to cache per-type information in .NET 4 that doesn't suffer from this problem? The easiest solution I can think of is:

private static ConcurrentDictionary<WeakReference, TCachedData> cache;

But the IEqualityComparer<T> I'd have to use with that would behave very strangely and would probably violate its contract too. I'm not sure how fast the lookup would be either.
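Just to illustrate why: a comparer for such keys would have to look roughly like the following sketch (WeakReferenceTypeComparer is only a hypothetical name, not something I'd actually ship). Once a key's target is collected, keys that used to be equal no longer are, and the hash code can change, which is exactly the kind of contract violation I mean:

using System;
using System.Collections.Generic;

// Hypothetical sketch only: equality and hash code depend on the weakly held target,
// so both can change the moment the target is collected.
class WeakReferenceTypeComparer : IEqualityComparer<WeakReference>
{
    public bool Equals(WeakReference x, WeakReference y)
    {
        object tx = x.Target, ty = y.Target;
        // A dead key compares unequal to everything, including entries it used to match.
        return tx != null && ReferenceEquals(tx, ty);
    }

    public int GetHashCode(WeakReference obj)
    {
        object target = obj.Target;
        // Supposed to stay stable, but changes once the target dies.
        return target != null ? target.GetHashCode() : 0;
    }
}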
Another idea is to use an expiration timeout. Might be the simplest solution, but feels a bit inelegant.
In cases where the type is supplied as a generic parameter, I can use a nested generic class, which should not suffer from this problem. But this doesn't work if the type is supplied in a variable.
class MyReflection
{
    internal static class Cache<T>
    {
        internal static TData data;
    }

    void DoSomething<T>()
    {
        DoSomethingWithData(Cache<T>.data);
        // Obviously simplified; should have similar creation logic to the previous code.
    }
}
Update: One idea I've just had is to use Type.AssemblyQualifiedName as the key. That should uniquely identify the type without keeping it in memory. I might even get away with using referential identity on this string.

One problem that remains with this solution is that the cached value might keep a reference to the type too. If I use a weak reference for that, it will most likely expire long before the assembly gets unloaded. And I'm not sure how cheap it is to get a normal reference out of a weak reference. Looks like I need to do some testing and benchmarking.
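Roughly what I have in mind, as a sketch (CachedData and BuildByReflection are just placeholder names): the string key holds no reference to the Type, but the value has to be wrapped in a WeakReference, which costs an extra dereference on every hit and can expire long before the assembly is unloaded:

using System;
using System.Collections.Concurrent;

static class AssemblyNameKeyedCache
{
    private static readonly ConcurrentDictionary<string, WeakReference> cache =
        new ConcurrentDictionary<string, WeakReference>();

    public static CachedData Get(Type type)
    {
        WeakReference weak;
        if (cache.TryGetValue(type.AssemblyQualifiedName, out weak))
        {
            var data = (CachedData)weak.Target; // extra dereference on every lookup
            if (data != null)
                return data;                    // fast path
        }

        CachedData created = BuildByReflection(type); // slow path
        cache[type.AssemblyQualifiedName] = new WeakReference(created);
        return created;
    }

    private static CachedData BuildByReflection(Type type)
    {
        // Reflect over the type and build whatever needs to be cached.
        return new CachedData();
    }
}

class CachedData { /* e.g. a delegate generated from the type */ }

The string keys and dead WeakReference wrappers would still accumulate, but that's a far smaller leak than pinning an entire collectible assembly.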
ConcurrentDictionary<WeakReference, CachedData> is incorrect in this case. Suppose we are trying to cache info for type T, so WeakReference.Target == typeof(T). The CachedData will most likely contain a reference to typeof(T) as well. Since ConcurrentDictionary<TKey, TValue> stores its items in an internal collection of Node<TKey, TValue>, you get a chain of strong references: ConcurrentDictionary instance -> Node instance -> Value property (CachedData instance) -> typeof(T). In general, it is impossible to avoid a memory leak with WeakReference when the values can hold references to their keys.
Adding support for ephemerons was necessary to make such a scenario possible without memory leaks. Fortunately, .NET 4.0 supports them, and we have the ConditionalWeakTable<TKey, TValue> class. The reasons it was introduced seem close to your task.

This approach also solves the problem mentioned in your update, since the reference to the Type will live exactly as long as the Assembly is loaded.
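A minimal sketch of what that could look like (ReflectionCache, CachedData and BuildByReflection are made-up names for illustration):

using System;
using System.Runtime.CompilerServices;

static class ReflectionCache
{
    private static readonly ConditionalWeakTable<Type, CachedData> cache =
        new ConditionalWeakTable<Type, CachedData>();

    public static CachedData Get(Type type)
    {
        // GetValue is thread-safe; the callback builds the value when the key has no entry yet.
        return cache.GetValue(type, BuildByReflection);
    }

    private static CachedData BuildByReflection(Type type)
    {
        // Slow path: reflect over the type and build the data to cache.
        return new CachedData();
    }
}

class CachedData { /* may safely reference the Type it was built for */ }

The table holds the Type weakly and keeps the CachedData alive only as long as the Type itself is alive, so once the collectible assembly becomes unreachable both the key and the value can be collected.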