How do I intersect two lists in OCaml?

vstrien picture vstrien · Mar 4, 2010 · Viewed 12.1k times · Source

When I have two lists in OCaml, for example

e1 = [3; 4; 5; 6; 7]

and

e2 = [1; 3; 5; 7; 9]

Is there an efficient way to get the intersection of those two lists? I.e.:

[3; 5; 7]

Because I don't like scanning every element in list e2 for every element in list e1, thus creating a big Oh of order n^2.

Answer

Pascal Cuoq picture Pascal Cuoq · Mar 4, 2010

As Franck and Rémi said, converting your lists to sets (from stdlib module Set) costs n log(n), and then Sets provides a linear implementation of intersection. Franck also mentioned the equivalent alternative to sort the lists, and then traverse them in a synchronized way. These are roughly the same (and by the way, in both cases you need to be able to provide a total ordering on the elements in your lists).

If intersections are an important part of your algorithm and you want them to be faster in the case of two sets of elements that are only slightly different, you need to switch to a mergeable structure such as Patricia trees. See files pt* in http://www.lri.fr/~filliatr/ftp/ocaml/ds/ .

If you need intersection to be fast in all cases, you have the possibility to use hash-consed Patricia trees. Hash-consing helps to recognize structurally identical sub-trees, and help to build efficient caches for previous operations by making comparison cheap.

Patricia trees can not use an arbitrary type as key (typically they are presented with ints as keys). But you can sometimes work around this limitation by numbering at creation each value you intend to use as a key.