C# 3.0: Need to return duplicates from a List<>

SaaS Developer picture SaaS Developer · Jan 29, 2009 · Viewed 7.6k times · Source

I have a List<> of objects in C# and I need a way to return those objects that are considered duplicates within the list. I do not need the Distinct resultset, I need a list of those items that I will be deleting from my repository.

For the sake of this example, lets say I have a list of "Car" types and I need to know which of these cars are the same color as another in the list. Here are the cars in the list and their color property:

Car1.Color = Red;

Car2.Color = Blue;

Car3.Color = Green;

Car4.Color = Red;

Car5.Color = Red;

For this example I need the result (IEnumerable<>, List<>, or whatever) to contain Car4 and Car5 because I want to delete these from my repository or db so that I only have one car per color in my repository. Any help would be appreciated.

Answer

Jon Skeet picture Jon Skeet · Jan 29, 2009

I inadvertently coded this yesterday, when I was trying to write a "distinct by a projection". I included a ! when I shouldn't have, but this time it's just right:

public static IEnumerable<TSource> DuplicatesBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        // Yield it if the key hasn't actually been added - i.e. it
        // was already in the set
        if (!seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

You'd then call it with:

var duplicates = cars.DuplicatesBy(car => car.Color);