When should I choose Vector in Scala?

Duncan McGregor picture Duncan McGregor · Aug 3, 2011 · Viewed 55.1k times · Source

It seems that Vector was late to the Scala collections party, and all the influential blog posts had already left.

In Java ArrayList is the default collection - I might use LinkedList but only when I've thought through an algorithm and care enough to optimise. In Scala should I be using Vector as my default Seq, or trying to work out when List is actually more appropriate?

Answer

Daniel Spiewak picture Daniel Spiewak · Aug 4, 2011

As a general rule, default to using Vector. It’s faster than List for almost everything and more memory-efficient for larger-than-trivial sized sequences. See this documentation of the relative performance of Vector compared to the other collections. There are some downsides to going with Vector. Specifically:

  • Updates at the head are slower than List (though not by as much as you might think)

Another downside before Scala 2.10 was that pattern matching support was better for List, but this was rectified in 2.10 with generalized +: and :+ extractors.

There is also a more abstract, algebraic way of approaching this question: what sort of sequence do you conceptually have? Also, what are you conceptually doing with it? If I see a function that returns an Option[A], I know that function has some holes in its domain (and is thus partial). We can apply this same logic to collections.

If I have a sequence of type List[A], I am effectively asserting two things. First, my algorithm (and data) is entirely stack-structured. Second, I am asserting that the only things I’m going to do with this collection are full, O(n) traversals. These two really go hand-in-hand. Conversely, if I have something of type Vector[A], the only thing I am asserting is that my data has a well defined order and a finite length. Thus, the assertions are weaker with Vector, and this leads to its greater flexibility.