Why can't I retrieve an item from a HashSet without enumeration?

sooniln picture sooniln · Sep 29, 2009 · Viewed 26.9k times · Source

I'm looking for insight into the heads of HashSet designers. As far as I am aware, my question applies to both Java and C# HashSets, making me think there must be some good reason for it, though I can't think of any myself.

After I have inserted an item into a HashSet, why is it impossible to retrieve that item without enumeration, hardly an efficient operation? Especially since a HashSet is explicitly built in a way which supports efficient retrieval.

It would often be useful to me to have Remove(x) and Contains(x) return the actual item that is being removed or contained. This is not necessarily the item I pass into the Remove(x) or Contains(x) function. Sure, I guess I could achieve the same effect through a HashMap but why waste all that space and effort when it should be perfectly possible to do this with a set?

I can appreciate that there may be some design concerns that adding this functionality would allows uses of HashSet which are not consistent with their role or future role in the framework, but if this is so, what are these design concerns?

Edit

To answer some more questions, here are more details:

I am using an immutable reference type with overridden hashcode, equals, etc to emulate a value type in C#. Let's say the type has members A, B, and C. Hashcode, equals, etc depend only on A and B. Given some A and B I want to be able to retrieve that equivalent item from a hashset and get it's C. I won't be able to use HashSet for this it appears, but I would at least like to know if there is any good reason for this. Pseudo code follows:

public sealed class X{
 object A;
 object B;
 object extra;

 public int HashCode(){
  return A.hashCode() + B.hashCode();
 }

 public bool Equals(X obj){
  return obj.A == A && obj.B == B;
 }
}

hashset.insert(new X(1,2, extra1));
hashset.contains(new X(1,2)); //returns true, but I can't retrieve extra

Answer

leat picture leat · Jan 26, 2011

In .Net, what you are probably looking for is KeyedCollection http://msdn.microsoft.com/en-us/library/ms132438.aspx

You can get around the nastiness of re-implementing this abstract class each time with some "generic" cleverness. (See IKeyedObject`1.)

Note: Any data transfer object which implements IKeyedObject`1 should have an overridden GetHashCode method simply returning this.Key.GetHashCode(); and same goes for equals...

My Base Class Library usually ends up with something like this in it:

public class KeyedCollection<TItem> : System.Collections.ObjectModel.KeyedCollection<TItem, TItem>
    where TItem : class
{
    public KeyedCollection() : base()
    {
    }

    public KeyedCollection(IEqualityComparer<TItem> comparer) : base(comparer)
    {
    }

    protected override TItem GetKeyForItem(TItem item)
    {
        return item;
    }
}

public class KeyedObjectCollection<TKey, TItem> : System.Collections.ObjectModel.KeyedCollection<TKey, TItem>
    where TItem : class, IKeyedObject<TKey>
    where TKey : struct
{
    public KeyedCollection() : base()
    {
    }

    protected override TItem GetKeyForItem(TItem item)
    {
        return item.Key;
    }
}

///<summary>
/// I almost always implement this explicitly so the only
/// classes that have access without some rigmarole
/// are generic collections built to be aware that an object
/// is keyed.
///</summary>
public interface IKeyedObject<TKey>
{
    TKey Key { get; }
}