Haskell record syntax

Rob Agar · Mar 20, 2011

Haskell's record syntax is considered by many to be a wart on an otherwise elegant language, on account of its ugly syntax and namespace pollution. On the other hand, it's often more useful than the position-based alternative.

Instead of a declaration like this:

data Foo = Foo { 
  fooID :: Int, 
  fooName :: String 
} deriving (Show)

It seems to me that something along these lines would be more attractive:

data Foo = Foo id   :: Int
               name :: String
               deriving (Show)

I'm sure there must be a good reason I'm missing, but why was the C-like record syntax adopted over a cleaner layout-based approach?

Secondly, is there anything in the pipeline to solve the namespace problem, so we can write id foo instead of fooID foo in future versions of Haskell? (Apart from the long-winded type-class-based workarounds currently available.)
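For concreteness, the type class workaround referred to here looks roughly like the sketch below. Foo is the record from the question; Bar, HasID, ident and sameID are hypothetical names invented for illustration. The method is called ident rather than id because id would clash with Prelude's id unless that import is hidden.

```haskell
-- The record from the question, plus a second hypothetical record.
-- Each field needs a unique prefix because record fields become
-- ordinary top-level functions in the module's namespace.
data Foo = Foo { fooID :: Int, fooName :: String } deriving (Show)
data Bar = Bar { barID :: Int } deriving (Show)

-- Illustrative class: one overloaded accessor shared by both records.
class HasID a where
  ident :: a -> Int

instance HasID Foo where
  ident = fooID

instance HasID Bar where
  ident = barID

-- Works for any pair of types that have an instance.
sameID :: (HasID a, HasID b) => a -> b -> Bool
sameID x y = ident x == ident y
```

The cost is one class and one instance per shared field name, which is where the "long-winded" complaint comes from.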

Answer

Dan Burton · Apr 2, 2011

Well if no one else is going to try, then I'll take another (slightly more carefully researched) stab at answering these questions.

tl;dr

Question 1: That's just the way the dice rolled. It was a circumstantial choice and it stuck.

Question 2: Yes (sorta). Several different parties have certainly been thinking about the issue.

Read on for a very longwinded explanation for each answer, based around links and quotes that I found to be relevant and interesting.

Why was the C-like record syntax adopted over a cleaner layout-based approach?

Paul Hudak, John Hughes, Simon Peyton Jones (of Microsoft Research), and Philip Wadler wrote a History of Haskell paper. Section 5.6 talks about records. I'll quote the first tiny bit, which is insightful:

One of the most obvious omissions from early versions of Haskell was the absence of records, offering named fields. Given that records are extremely useful in practice, why were they omitted?

The authors then answer their own question:

The strongest reason seems to have been that there was no obvious “right” design.

You can read the paper yourself for the details, but they say Haskell eventually adopted record syntax due to "pressure for named fields in data structures".

By the time the Haskell 1.3 design was under way, in 1993, the user pressure for named fields in data structures was strong, so the committee eventually adopted a minimalist design...

You ask why it is the way it is? Well, from what I understand, if the early Haskellers had their way, we might've never had record syntax in the first place. The idea was apparently pushed onto Haskell by people who were already used to C-like syntax, and were more interested in getting C-like things into Haskell than in doing things "the Haskell way". (Yes, I realize this is an extremely subjective interpretation. I could be dead wrong, but in the absence of better answers, this is the best conclusion I can draw.)

Is there anything in the pipeline to solve the namespace problem?

First of all, not everyone feels it is a problem. A few weeks ago, a Racket enthusiast explained to me (and others) that having different functions with the same name was a bad idea, because it complicates analysis of "what does the function named ___ do?" It is not, in fact, one function, but many. The idea can be extra troublesome for Haskell, since it complicates type inference.

On a slight tangent, the History of Haskell authors have interesting things to say about Haskell's typeclasses:

It was a happy coincidence of timing that Wadler and Blott happened to produce this key idea at just the moment when the language design was still in flux.

Don't forget that Haskell was young once. Some decisions were made simply because they were made.

Anyways, there are a few interesting ways that this "problem" could be dealt with:

Type Directed Name Resolution, a proposed modification to Haskell (mentioned in comments above). Just read that page to see that it touches a lot of areas of the language. All in all, it ain't a bad idea. A lot of thought has been put into it so that it won't clash with stuff. However, it will still require significantly more attention to get it into the now-(more-)mature Haskell language.

Another Microsoft paper, OO Haskell, specifically proposes an extension to the Haskell language to support "ad hoc overloading". It's rather complicated, so you'll just have to check out Section 4 for yourself. The gist of it is to automatically (?) infer "Has" types, and to add an additional step to type checking that they call "improvement", vaguely outlined in the selective quotes that follow:

Given the class constraint Has_m (Int -> C -> r) there is only one instance for m that matches this constraint...Since there is exactly one choice, we should make it now, and that in turn fixes r to be Int. Hence we get the expected type for f: f :: C -> Int -> IO Int...[this] is simply a design choice, and one based on the idea that the class Has_m is closed

Apologies for the incoherent quoting; if that helps you at all, then great, otherwise just go read the paper. It's a complicated (but convincing) idea.
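For a rough feel of what a "Has" constraint looks like in code, here is a minimal sketch that uses GHC's existing functional dependencies rather than the extension the paper proposes. The names HasSize, Circle, Crowd and describe are made up for illustration, and the closed-class reasoning in the quote is only approximated: the dependency r -> a lets the type checker fix the field type as soon as the record type is known.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

-- Two hypothetical records whose "size" fields have different types.
data Circle = Circle { circleSize :: Double }
data Crowd  = Crowd  { crowdSize :: Int }

-- "Record r has a size field of type a". The dependency r -> a says
-- the record type alone determines the field type.
class HasSize r a | r -> a where
  size :: r -> a

instance HasSize Circle Double where
  size = circleSize

instance HasSize Crowd Int where
  size = crowdSize

-- Without the dependency, the result type of `size c` would be
-- ambiguous here. With it, knowing c :: Crowd is enough to resolve
-- the field type to Int, so `show` can be picked.
describe :: Crowd -> String
describe c = show (size c)
```

This is an assumption-laden analogy, not the paper's design: it still needs hand-written instances, whereas the paper talks about inferring the "Has" constraints.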

Chris Done has used Template Haskell to provide duck typing in Haskell in a vaguely similar manner to the OO Haskell paper (using "Has" types). A few interactive session samples from his site:

λ> flap ^. donald
*Flap flap flap*
λ> flap ^. chris
I'm flapping my arms!

fly :: (Has Flap duck) => duck -> IO ()
fly duck = do go; go; go where go = flap ^. duck

λ> fly donald
*Flap flap flap*
*Flap flap flap*
*Flap flap flap*

This requires a little boilerplate/unusual syntax, and I personally would prefer to stick to typeclasses. But kudos to Chris Done for freely publishing his down-to-earth work in the area.