What is the reason for the weird syntax of the "case" statement in a bash/zsh script?

phunehehe · Nov 21, 2010

From a programmer's point of view, shell script is just another programming language, where one has to learn and conform to the rules of the language. However, I have to admit that this syntax is the weirdest style I have ever seen in a rather commonly used language. Did the shell take this syntax from an older language that it descends from? Is there a special implication or meaning in the syntax?

As an example, here is a little snippet taken from another post on SO:

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    status)
        check_status
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac

Looking at this, firstly I can see that case ends with esac, which is its reversed form (like if ending in fi). Secondly I understand that each case is followed by a ). Fair enough, but why on earth do I need two ; at the end of each statement? I would also say that the ) without an accompanying ( is ugly.

I'm looking for more information about the historical aspect of the language, but I'm open to technical reasons as well.

Answer

Jonathan Leffler · Nov 22, 2010

Per request:

  • So can you guess why a loop is 'for ...; do ...; done' and not 'for ...; do ...; od'? There was a sound reason for it - but the Algol-like reversed keyword to mark the end was used elsewhere.

Answer:

  • The syntax came from Stephen Bourne (of Bourne shell fame). He had worked on Algol 68, and liked it enough to model some of the shell syntax on it. Algol 68 uses reversed keywords to mark the ends of constructs ('if ... fi', 'case ... esac', 'do ... od'), so 'case ... esac' was appropriate. The reason that shell loops do not end with 'od' is that there was already a command 'od' in Unix - octal dump. So, 'done' is used instead.
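To make the naming pattern concrete, here is a small sketch. The helper name closing_keyword is hypothetical and exists only for illustration; it maps each construct to the word that closes it, then checks that 'od' resolves to an ordinary command rather than a shell keyword.

```shell
#!/bin/sh
# Hypothetical helper: report which word closes a given shell construct.
closing_keyword() {
    case "$1" in
        if)              echo "fi" ;;      # reversed keyword
        case)            echo "esac" ;;    # reversed keyword
        for|while|until) echo "done" ;;    # not "od"
        *)               echo "unknown construct: $1" >&2; return 1 ;;
    esac
}

for construct in if case for; do
    echo "$construct ... $(closing_keyword "$construct")"
done

# "od" is an ordinary utility looked up on PATH, not a shell keyword.
command -v od || echo "od not found on PATH"
```

On most systems the last line prints a path to the od utility, which is the name clash the answer describes.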

By reputation, the Bourne shell source code was written in idiosyncratic C with macros to make it look like Algol. This made it hard to maintain.

With respect to the main question - about why no opening bracket (parenthesis) around the alternatives in the case statement - I have a couple of related theories.

First of all, back when the Bourne shell was written (late 1970s), much editing was done with 'ed', the standard text editor. It has no concept of skipping to a balanced parenthesis or other such notations, so there was no requirement for a leading parenthesis. Also, if you are writing a document, you might well marshal your arguments with:

a) ...blah...
b) ...more...
c) ...again...

The opening parenthesis is often omitted - and the case statement would fit into that model quite happily.

Of course, since then, we have grown used to editors that mark the matching open parenthesis when you type a close parenthesis, so the old Bourne shell notation is a nuisance. The POSIX standard makes the leading parenthesis optional; most modern implementations of POSIX-like shells (Korn, Bash, Zsh) support that, and I generally use it when I don't have to worry about portability to machines like Solaris 10, where /bin/sh is still a faithful Bourne shell that does not allow the leading parenthesis. (I usually deal with that by using #!/bin/ksh as the shebang.)
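As a minimal sketch of the two spellings, the first function below uses the optional leading parenthesis and the second uses the classic Bourne form. The function names and messages are made up for illustration; only the case syntax itself is the point.

```shell
#!/bin/sh
# POSIX form: each pattern may be preceded by an optional "(",
# so editors can match it against the closing ")".
dispatch() {
    case "$1" in
        (start)   echo "starting" ;;
        (stop)    echo "stopping" ;;
        (restart) echo "stopping"; echo "starting" ;;
        (*)       echo "usage: dispatch {start|stop|restart}" >&2; return 1 ;;
    esac
}

# Classic Bourne form: the same branches without the leading "(".
dispatch_classic() {
    case "$1" in
        start)   echo "starting" ;;
        stop)    echo "stopping" ;;
        restart) echo "stopping"; echo "starting" ;;
        *)       echo "usage: dispatch_classic {start|stop|restart}" >&2; return 1 ;;
    esac
}

dispatch restart
dispatch_classic restart
```

The restart branch also shows why the terminator is doubled: a single ; only separates commands within a branch, so a distinct token, ;;, is needed to mark where the branch ends.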