Best way to compare two complex objects

DmitryBoyko picture DmitryBoyko · May 4, 2012 · Viewed 241.8k times · Source

I have two complex objects like Object1 and Object2. They have around 5 levels of child objects.

I need the fastest method to say if they are same or not.

How could this be done in C# 4.0?

Answer

Douglas picture Douglas · May 4, 2012

Implement IEquatable<T> (typically in conjunction with overriding the inherited Object.Equals and Object.GetHashCode methods) on all your custom types. In the case of composite types, invoke the contained types’ Equals method within the containing types. For contained collections, use the SequenceEqual extension method, which internally calls IEquatable<T>.Equals or Object.Equals on each element. This approach will obviously require you to extend your types’ definitions, but its results are faster than any generic solutions involving serialization.

Edit: Here is a contrived example with three levels of nesting.

For value types, you can typically just call their Equals method. Even if the fields or properties were never explicitly assigned, they would still have a default value.

For reference types, you should first call ReferenceEquals, which checks for reference equality – this would serve as an efficiency boost when you happen to be referencing the same object. It would also handle cases where both references are null. If that check fails, confirm that your instance's field or property is not null (to avoid NullReferenceException) and call its Equals method. Since our members are properly typed, the IEquatable<T>.Equals method gets called directly, bypassing the overridden Object.Equals method (whose execution would be marginally slower due to the type cast).

When you override Object.Equals, you’re also expected to override Object.GetHashCode; I didn’t do so below for the sake of conciseness.

public class Person : IEquatable<Person>
{
    public int Age { get; set; }
    public string FirstName { get; set; }
    public Address Address { get; set; }

    public override bool Equals(object obj)
    {
        return this.Equals(obj as Person);
    }

    public bool Equals(Person other)
    {
        if (other == null)
            return false;

        return this.Age.Equals(other.Age) &&
            (
                object.ReferenceEquals(this.FirstName, other.FirstName) ||
                this.FirstName != null &&
                this.FirstName.Equals(other.FirstName)
            ) &&
            (
                object.ReferenceEquals(this.Address, other.Address) ||
                this.Address != null &&
                this.Address.Equals(other.Address)
            );
    }
}

public class Address : IEquatable<Address>
{
    public int HouseNo { get; set; }
    public string Street { get; set; }
    public City City { get; set; }

    public override bool Equals(object obj)
    {
        return this.Equals(obj as Address);
    }

    public bool Equals(Address other)
    {
        if (other == null)
            return false;

        return this.HouseNo.Equals(other.HouseNo) &&
            (
                object.ReferenceEquals(this.Street, other.Street) ||
                this.Street != null &&
                this.Street.Equals(other.Street)
            ) &&
            (
                object.ReferenceEquals(this.City, other.City) ||
                this.City != null &&
                this.City.Equals(other.City)
            );
    }
}

public class City : IEquatable<City>
{
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        return this.Equals(obj as City);
    }

    public bool Equals(City other)
    {
        if (other == null)
            return false;

        return
            object.ReferenceEquals(this.Name, other.Name) ||
            this.Name != null &&
            this.Name.Equals(other.Name);
    }
}

Update: This answer was written several years ago. Since then, I've started to lean away from implementing IEquality<T> for mutable types for such scenarios. There are two notions of equality: identity and equivalence. At a memory representation level, these are popularly distinguished as “reference equality” and “value equality” (see Equality Comparisons). However, the same distinction can also apply at a domain level. Suppose that your Person class has a PersonId property, unique per distinct real-world person. Should two objects with the same PersonId but different Age values be considered equal or different? The answer above assumes that one is after equivalence. However, there are many usages of the IEquality<T> interface, such as collections, that assume that such implementations provide for identity. For example, if you're populating a HashSet<T>, you would typically expect a TryGetValue(T,T) call to return existing elements that share merely the identity of your argument, not necessarily equivalent elements whose contents are completely the same. This notion is enforced by the notes on GetHashCode:

In general, for mutable reference types, you should override GetHashCode() only if:

  • You can compute the hash code from fields that are not mutable; or
  • You can ensure that the hash code of a mutable object does not change while the object is contained in a collection that relies on its hash code.