Understanding the tilde in Scala's parser combinators

Jano picture Jano · Jul 25, 2011 · Viewed 10.6k times · Source

I'm fairly new to Scala and while reading about parser combinators(The Magic Behind Parser Combinators, Domain-Specific Languages in Scala) I came across method definitions like this:

def classPrefix = "class" ~ ID ~ "(" ~ formals ~ ")"

I've been reading throught the API doc of scala.util.parsing.Parsers which defines a method named (tilde) but I still dont't really understand its usage in the example above. In that example (tilde) is a method which is called on java.lang.String which doesn't have that method and causes the compiler to fail. I know that (tilde) is defined as

case class ~ [+a, +b] (_1: a, _2: b)

but how does this help in the example above?

I'd be happy if someone could give me a hint to understand what's going on here. Thank you very much in advance!

Jan

Answer

Rex Kerr picture Rex Kerr · Jul 25, 2011

The structure here is a little bit tricky. First, notice that you always define these things inside a subclass of some parser, e.g. class MyParser extends RegexParsers. Now, you may note two implicit definitions inside RegexParsers:

implicit def literal (s: String): Parser[String]
implicit def regex (r: Regex): Parser[String]

What these will do is take any string or regex and convert them into a parser that matches that string or that regex as a token. They're implicit, so they'll be applied any time they're needed (e.g. if you call a method on Parser[String] that String (or Regex) does not have).

But what is this Parser thing? It's an inner class defined inside Parsers, the supertrait for RegexParser:

class Parser [+T] extends (Input) ⇒ ParseResult[T]

Looks like it's a function that takes input and maps it to a result. Well, that makes sense! And you can see the documentation for it here.

Now we can just look up the ~ method:

def ~ [U] (q: ⇒ Parser[U]): Parser[~[T, U]]
  A parser combinator for sequential composition
  p ~ q' succeeds if p' succeeds and q' succeeds on the input left over by p'.

So, if we see something like

def seaFacts = "fish" ~ "swim"

what happens is, first, "fish" does not have the ~ method, so it's implicitly converted to Parser[String] which does. The ~ method then wants an argument of type Parser[U], and so we implicitly convert "swim" into Parser[String] (i.e. U == String). Now we have something that will match an input "fish", and whatever is left in the input should match "swim", and if both are the case, then seaFacts will succeed in its match.