How to use the IEqualityComparer

Akrem picture Akrem · Jul 14, 2011 · Viewed 166.5k times · Source

I have some bells in my database with the same number. I want to get all of them without duplication. I created a compare class to do this work, but the execution of the function causes a big delay from the function without distinct, from 0.6 sec to 3.2 sec!

Am I doing it right or do I have to use another method?

reg.AddRange(
    (from a in this.dataContext.reglements
     join b in this.dataContext.Clients on a.Id_client equals b.Id
     where a.date_v <= datefin && a.date_v >= datedeb
     where a.Id_client == b.Id
     orderby a.date_v descending 
     select new Class_reglement
     {
         nom  = b.Nom,
         code = b.code,
         Numf = a.Numf,
     })
    .AsEnumerable()
    .Distinct(new Compare())
    .ToList());

class Compare : IEqualityComparer<Class_reglement>
{
    public bool Equals(Class_reglement x, Class_reglement y)
    {
        if (x.Numf == y.Numf)
        {
            return true;
        }
        else { return false; }
    }
    public int GetHashCode(Class_reglement codeh)
    {
        return 0;
    }
}

Answer

Konrad Rudolph picture Konrad Rudolph · Jul 14, 2011

Your GetHashCode implementation always returns the same value. Distinct relies on a good hash function to work efficiently because it internally builds a hash table.

When implementing interfaces of classes it is important to read the documentation, to know which contract you’re supposed to implement.1

In your code, the solution is to forward GetHashCode to Class_reglement.Numf.GetHashCode and implement it appropriately there.

Apart from that, your Equals method is full of unnecessary code. It could be rewritten as follows (same semantics, ¼ of the code, more readable):

public bool Equals(Class_reglement x, Class_reglement y)
{
    return x.Numf == y.Numf;
}

Lastly, the ToList call is unnecessary and time-consuming: AddRange accepts any IEnumerable so conversion to a List isn’t required. AsEnumerable is also redundant here since processing the result in AddRange will cause this anyway.


1 Writing code without knowing what it actually does is called cargo cult programming. It’s a surprisingly widespread practice. It fundamentally doesn’t work.