Find all intersecting data, not just the unique values

Peterdk picture Peterdk · Feb 1, 2010 · Viewed 9.5k times · Source

I thought that I understood Intersect, but it turns out I was wrong.

 List<int> list1 = new List<int>() { 1, 2, 3, 2, 3};
 List<int> list2 = new List<int>() { 2, 3, 4, 3, 4};

 list1.Intersect(list2) =>      2,3

 //But what I want is:
 // =>  2,3,2,3,2,3,3

I can figure a way like:

 var intersected = list1.Intersect(list2);
 var list3 = new List<int>();
 list3.AddRange(list1.Where(I => intersected.Contains(I)));
 list3.AddRange(list2.Where(I => intersected.Contains(I)));

Is there a easier way in LINQ to achieve this?

I do need to state that I do not care in which order the results are given.

2,2,2,3,3,3,3 would also be perfectly OK.

Problem is that I am using this on a very large collection, So I need efficiency.

We are talking about Objects, not ints. The ints were just for the easy example, but I realize this can make a difference.

Answer

Eric Lippert picture Eric Lippert · Feb 1, 2010

Let's see if we can precisely characterize what you want. Correct me if I am wrong. You want: all elements of list 1, in order, that also appear in list 2, followed by all elements of list 2, in order, that also appear in list 1. Yes?

Seems straightforward.

return list1.Where(x=>list2.Contains(x))
     .Concat(list2.Where(y=>list1.Contains(y)))
     .ToList();

Note that this is not efficient for large lists. If the lists have a thousand items each then this does a couple million comparisons. If you're in that situation then you want to use a more efficient data structure for testing membership:

list1set = new HashSet(list1);
list2set = new HashSet(list2);

return list1.Where(x=>list2set.Contains(x))
     .Concat(list2.Where(y=>list1set.Contains(y)))
     .ToList();

which only does a couple thousand comparisons, but potentially uses more memory.