Efficiently checking whether the intersection of two lists is empty, in Python

Manuel Araoz · Feb 4, 2010

Suppose I have two lists, L and M, and I want to know if they share an element. What is the fastest way (in Python) to check whether they do? I don't care which elements they share, or how many, just whether they share any.

For example, in this case

L = [1,2,3,4,5,6]
M = [8,9,10]

I should get False, and here:

L = [1,2,3,4,5,6]
M = [5,6,7]

I should get True.

I hope the question's clear. Thanks!

Manuel

Answer

John La Rooy · Feb 4, 2010

A concise check is to intersect the two sets

if set(L) & set(M):
    print("there is an intersection")
else:
    print("no intersection")

If you really need True or False

bool(set(L) & set(M))
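
Not part of the answer above, but worth knowing: the built-in `set.isdisjoint` answers the same yes/no question without building the intersection, and it accepts any iterable as its argument. A minimal sketch, where the helper name `share_element` is made up for illustration:

```python
def share_element(L, M):
    """Return True if L and M have at least one (hashable) item in common."""
    # isdisjoint() takes any iterable, so only L is converted to a set here.
    return not set(L).isdisjoint(M)


print(share_element([1, 2, 3, 4, 5, 6], [8, 9, 10]))  # False
print(share_element([1, 2, 3, 4, 5, 6], [5, 6, 7]))   # True
```

Like the set intersection, this only works when the items are hashable.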

After running some timings, this seems to be a good option to try too

m_set = set(M)
any(x in m_set for x in L)
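
To make the short-circuit behaviour of this variant visible, here is a small sketch that wraps it in a function (the name `any_shared` is hypothetical) and passes an iterator as the first argument, so you can see how many items were actually consumed:

```python
def any_shared(L, M):
    """Return True as soon as one item of L is found in M."""
    m_set = set(M)                      # one pass over M to build the lookup set
    return any(x in m_set for x in L)   # any() stops at the first match


items = iter([5, 6, 7, 8])
print(any_shared(items, [1, 2, 3, 4, 5, 6]))  # True
print(next(items))                            # 6: only the first item was consumed
```

Only M is turned into a set; L is scanned lazily, which is why this variant can win when a match turns up early.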

If the items in M or L are not hashable, you have to fall back to a less efficient approach like this

any(x in M for x in L)
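
As an illustration of the unhashable case, the sketch below uses lists of lists (sample data chosen for this example, not taken from the question). Building a set fails with a `TypeError`, while the plain scan still works because it only needs `==` comparisons:

```python
L = [[1, 2], [3, 4]]
M = [[5, 6], [3, 4]]

try:
    set(M)
except TypeError as exc:
    # Lists are mutable and therefore unhashable, so set(M) is rejected.
    print("cannot build a set:", exc)

# Falls back to comparing every item of L against every item of M.
shared = any(x in M for x in L)
print(shared)  # True, because [3, 4] appears in both lists
```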

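Here are some timings for 100-item lists, taken with IPython's timeit on Python 2 (where range returns a list). Using sets is considerably faster when there is no intersection (about 32 µs versus 374 µs for the plain list scan). When half the items overlap, the set intersection is slower (18 µs versus 4.88 µs), because any() stops at the first match while set(L) & set(M) always builds both sets.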

M=range(100)
L=range(100,200)

timeit set(L) & set(M)
10000 loops, best of 3: 32.3 µs per loop

timeit any(x in M for x in L)
1000 loops, best of 3: 374 µs per loop

timeit m_set=frozenset(M);any(x in m_set  for x in L)
10000 loops, best of 3: 31 µs per loop

L=range(50,150)

timeit set(L) & set(M)
10000 loops, best of 3: 18 µs per loop

timeit any(x in M for x in L)
100000 loops, best of 3: 4.88 µs per loop

timeit m_set=frozenset(M);any(x in m_set  for x in L)
100000 loops, best of 3: 9.39 µs per loop


# Now for some random lists
import random
L=[random.randrange(200000) for x in xrange(1000)]
M=[random.randrange(200000) for x in xrange(1000)]

timeit set(L) & set(M)
1000 loops, best of 3: 420 µs per loop

timeit any(x in M for x in L)
10 loops, best of 3: 21.2 ms per loop

timeit m_set=set(M);any(x in m_set  for x in L)
1000 loops, best of 3: 168 µs per loop

timeit m_set=frozenset(M);any(x in m_set  for x in L)
1000 loops, best of 3: 371 µs per loop
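
To rerun this comparison on Python 3, something like the following sketch could be used. It relies on the standard `timeit` module instead of IPython, and the function names (`set_intersection`, `list_scan`, `set_lookup`, `time_approaches`) are made up for this example. Absolute numbers will differ from those above depending on machine and interpreter version.

```python
import random
import timeit


def set_intersection(L, M):
    return bool(set(L) & set(M))


def list_scan(L, M):
    return any(x in M for x in L)


def set_lookup(L, M):
    m_set = set(M)
    return any(x in m_set for x in L)


def time_approaches(L, M, number=50):
    """Return {function name: average seconds per call} for each approach."""
    results = {}
    for func in (set_intersection, list_scan, set_lookup):
        total = timeit.timeit(lambda: func(L, M), number=number)
        results[func.__name__] = total / number
    return results


if __name__ == "__main__":
    # list(range(...)) mirrors Python 2, where range() returned a list.
    cases = {
        "disjoint": (list(range(100, 200)), list(range(100))),
        "overlapping": (list(range(50, 150)), list(range(100))),
        "random": (
            [random.randrange(200000) for _ in range(1000)],
            [random.randrange(200000) for _ in range(1000)],
        ),
    }
    for label, (L, M) in cases.items():
        for name, seconds in time_approaches(L, M).items():
            print(f"{label:12s} {name:18s} {seconds * 1e6:10.2f} µs per call")
```

The set is built inside each timed call, as in the original measurements, so the cost of constructing it is included.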