Is the Scala 2.8 collections library a case of "the longest suicide note in history"?

oxbow_lakes picture oxbow_lakes · Nov 12, 2009 · Viewed 113.8k times · Source

I have just started to look at the Scala collections library re-implementation which is coming in the imminent 2.8 release. Those familiar with the library from 2.7 will notice that the library, from a usage perspective, has changed little. For example...

> List("Paris", "London").map(_.length)
res0: List[Int] List(5, 6)

...would work in either versions. The library is eminently useable: in fact it's fantastic. However, those previously unfamiliar with Scala and poking around to get a feel for the language now have to make sense of method signatures like:

def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That

For such simple functionality, this is a daunting signature and one which I find myself struggling to understand. Not that I think Scala was ever likely to be the next Java (or /C/C++/C#) - I don't believe its creators were aiming it at that market - but I think it is/was certainly feasible for Scala to become the next Ruby or Python (i.e. to gain a significant commercial user-base)

  • Is this going to put people off coming to Scala?
  • Is this going to give Scala a bad name in the commercial world as an academic plaything that only dedicated PhD students can understand? Are CTOs and heads of software going to get scared off?
  • Was the library re-design a sensible idea?
  • If you're using Scala commercially, are you worried about this? Are you planning to adopt 2.8 immediately or wait to see what happens?

Steve Yegge once attacked Scala (mistakenly in my opinion) for what he saw as its overcomplicated type-system. I worry that someone is going to have a field day spreading FUD with this API (similarly to how Josh Bloch scared the JCP out of adding closures to Java).

Note - I should be clear that, whilst I believe that Joshua Bloch was influential in the rejection of the BGGA closures proposal, I don't ascribe this to anything other than his honestly-held beliefs that the proposal represented a mistake.


Despite whatever my wife and coworkers keep telling me, I don't think I'm an idiot: I have a good degree in mathematics from the University of Oxford, and I've been programming commercially for almost 12 years and in Scala for about a year (also commercially).

Note the inflammatory subject title is a quotation made about the manifesto of a UK political party in the early 1980s. This question is subjective but it is a genuine question, I've made it CW and I'd like some opinions on the matter.

Answer

Martin Odersky picture Martin Odersky · Nov 13, 2009

I hope it's not a "suicide note", but I can see your point. You hit on what is at the same time both a strength and a problem of Scala: its extensibility. This lets us implement most major functionality in libraries. In some other languages, sequences with something like map or collect would be built in, and nobody has to see all the hoops the compiler has to go through to make them work smoothly. In Scala, it's all in a library, and therefore out in the open.

In fact the functionality of map that's supported by its complicated type is pretty advanced. Consider this:

scala> import collection.immutable.BitSet
import collection.immutable.BitSet

scala> val bits = BitSet(1, 2, 3)
bits: scala.collection.immutable.BitSet = BitSet(1, 2, 3)

scala> val shifted = bits map { _ + 1 }
shifted: scala.collection.immutable.BitSet = BitSet(2, 3, 4)

scala> val displayed = bits map { _.toString + "!" }
displayed: scala.collection.immutable.Set[java.lang.String] = Set(1!, 2!, 3!)

See how you always get the best possible type? If you map Ints to Ints you get again a BitSet, but if you map Ints to Strings, you get a general Set. Both the static type and the runtime representation of map's result depend on the result type of the function that's passed to it. And this works even if the set is empty, so the function is never applied! As far as I know there is no other collection framework with an equivalent functionality. Yet from a user perspective this is how things are supposed to work.

The problem we have is that all the clever technology that makes this happen leaks into the type signatures which become large and scary. But maybe a user should not be shown by default the full type signature of map? How about if she looked up map in BitSet she got:

map(f: Int => Int): BitSet     (click here for more general type)

The docs would not lie in that case, because from a user perspective indeed map has the type (Int => Int) => BitSet. But map also has a more general type which can be inspected by clicking on another link.

We have not yet implemented functionality like this in our tools. But I believe we need to do this, to avoid scaring people off and to give more useful info. With tools like that, hopefully smart frameworks and libraries will not become suicide notes.