Are Pascal comments supposed to nest?

ubiquibacon · Oct 1, 2010

I have a compiler homework question that wants me to draw a DFA for Pascal comments, but I have never used Pascal (and probably never will). The question does not specify whether we should use ANSI Pascal or Turbo Pascal, so I was going to draw one for each.

Some Google searches showed me that Turbo Pascal allows nested comments as long as the same delimiter is not used, so {(*comment*)} is ok and so is (*{comment}*), but {{comment}} and (*(*comment*)*) are not ok. My question is whether a comment like {(*{comment}*)} or (*{(*comment*)}*) is ok, since the same delimiters are not used twice in a row.

I also Googled to see if ANSI Pascal allowed nested comments, but I have not found a definitive answer. If ANSI Pascal does allow nested comments, what are the rules?

On a side note, I am not worried about the strange commenting convention of different starting and ending delimiters, a convention that is allowed in ANSI Pascal.

UPDATE:

I would like to be clear that I am not really concerned with how specific compilers handle Pascal comments; I am concerned with how a compiler SHOULD handle Pascal comments if that compiler adhered 100% to either the ANSI or Turbo Pascal standard.

Answer

Rob Kennedy · Oct 2, 2010

ISO 7185:1990 says this about comments in §6.1.8:

Where a commentary shall be any sequence of characters and separations of lines, containing neither } nor *), the construct

( '{' | '(*' ) commentary ( '*)' | '}' )

shall be a comment if neither the { nor the (* occurs within a character-string or within a commentary.

NOTES

  1. A comment may thus commence with { and end with *), or commence with (* and end with }.
  2. The sequence (*) cannot occur in a commentary even though the sequence {) can.

According to that, there's essentially just one kind of comment. Although you can start a comment with either { or (*, you can't use different sets of characters to "wrap" other comments. Turbo Pascal, on the other hand, has two kinds of comments, those that use braces and those that use parentheses.

If you have a brace-style comment and enclose it in a parenthesis-style comment to create (*{}*), ISO says the comment is (*{} with *) left over, which will be a syntax error in your code, whereas Turbo Pascal says the comment is (*{}*) with no text left over.

Neither dialect lets you wrap an already-wrapped comment in yet another set of comment delimiters. That's because once you wrap a comment, the "inner" comment is no longer treated as a comment. It's just ordinary text. If you have {(**)} and you want to wrap it in a parenthesis-style comment to make (*{(**)}*), ISO and Turbo Pascal both say the comment is (*{(**) with }*) left over.
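To make the difference concrete, here is a minimal Python sketch of the two rules as described above. The function names (iso_comment_end, turbo_comment_end, split_comment) are made up for illustration, and the sketch assumes the scan starts exactly at a comment opener that is not inside a character-string. It ignores everything else a real compiler does, such as compiler directives.

```python
def iso_comment_end(text, start=0):
    """ISO 7185 rule: either opener, then stop at the first } or *).

    Returns the index just past the comment, or None if `start` is not
    at an opener or the comment never terminates.
    """
    if text.startswith("{", start):
        i = start + 1
    elif text.startswith("(*", start):
        i = start + 2
    else:
        return None
    while i < len(text):
        if text[i] == "}":
            return i + 1
        if text.startswith("*)", i):
            return i + 2
        i += 1
    return None


def turbo_comment_end(text, start=0):
    """Turbo Pascal rule: the closer must match the opener."""
    if text.startswith("{", start):
        opener_len, closer = 1, "}"
    elif text.startswith("(*", start):
        opener_len, closer = 2, "*)"
    else:
        return None
    end = text.find(closer, start + opener_len)
    return None if end == -1 else end + len(closer)


def split_comment(text, scanner):
    """Split text into (comment, leftover) using the given scanner."""
    end = scanner(text)
    if end is None:
        return None
    return text[:end], text[end:]


print(split_comment("(*{}*)", iso_comment_end))        # ('(*{}', '*)')
print(split_comment("(*{}*)", turbo_comment_end))      # ('(*{}*)', '')
print(split_comment("(*{(**)}*)", iso_comment_end))    # ('(*{(**)', '}*)')
print(split_comment("(*{(**)}*)", turbo_comment_end))  # ('(*{(**)', '}*)')
```

The first pair shows the two dialects disagreeing on (*{}*), and the second pair shows them agreeing that }*) is left over.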

In fpc and objfpc modes, Free Pascal supports nesting comments, so my most recent example would be accepted as a valid comment with no leftover text. The FPC documentation on the matter doesn't actually demonstrate that, though. It gives six examples of supposedly nested comments:

{ Comment 1 (* comment 2 *) }  
(* Comment 1 { comment 2 } *)  
{ comment 1 // Comment 2 }  
(* comment 1 // Comment 2 *)  
// comment 1 (* comment 2 *)  
// comment 1 { comment 2 }

But Turbo Pascal (or Delphi, for the last two lines), which we've already established doesn't support nesting, would accept all of those as perfectly valid comments! FPC really does support nested comments, but those examples don't illustrate it. Here are some examples of nested comments:

{ Comment 1 { comment 2 } }
(* Comment 1 (* comment 2 *) *)
{ Comment 1 (* comment 2 { comment 3 } *) }

I'd even argue that FPC's third and fourth examples are actually counterexamples to the claim that FPC supports nested comments. Those two lines shouldn't be complete comments at all. The // in the middle of the line introduces a comment that doesn't terminate until the end of the line (EOL). The EOL comes after the } and *), so the inner, slash-style comment can't possibly have terminated by the time we reach the brace or parenthesis delimiters. Just as <a> <b> </a> </b> is improperly nested XML, we have improperly nested comments: { // } EOL.

After further experimentation, I conclude that in FPC, only comments of the same type nest. For example, once you open a comment with {, the (* characters cease to be special. FPC will not open a parenthesis-style comment inside the brace-style comment. If it did, then {(*} would be a syntax error due to the unterminated parenthesis comment. We could explain this by saying that termination of inner comments is optional when the comments are heterogeneous, but it's a simpler explanation to say that the inner comment isn't detected as a comment at all. Once you open a brace-style comment, only braces have meaning. Furthermore, // doesn't actually start a slash-style comment when it's inside braces. That's a simpler explanation than to say that FPC allows slash-style comments to be improperly nested. You can either say that slash-style comments don't nest at all, or that slash-style comments nested within other slash-style comments share a single EOL terminator.
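The "only comments of the same type nest" model can be sketched with a depth counter. This is a rough Python illustration of the behaviour inferred above from experimentation, not FPC's actual scanner. The name fpc_comment_end is hypothetical, and how the sketch treats overlapping sequences such as (*) inside a parenthesis comment is an assumption.

```python
def fpc_comment_end(text, start=0):
    """Return the index just past the comment opening at `start`.

    Only the delimiter pair that opened the comment is tracked, so a
    different delimiter style inside it is treated as plain text.
    Returns None if there is no opener at `start` or no terminator.
    """
    if text.startswith("{", start):
        opener, closer = "{", "}"
    elif text.startswith("(*", start):
        opener, closer = "(*", "*)"
    elif text.startswith("//", start):
        # Slash-style: runs to the end of the line, no nesting to track.
        newline = text.find("\n", start)
        return len(text) if newline == -1 else newline
    else:
        return None

    depth = 0
    i = start
    while i < len(text):
        if text.startswith(opener, i):
            depth += 1
            i += len(opener)
        elif text.startswith(closer, i):
            depth -= 1
            i += len(closer)
            if depth == 0:
                return i
        else:
            i += 1
    return None  # ran out of text with the comment still open


for sample in ["{ Comment 1 { comment 2 } }",
               "{(*}",
               "{ comment 1 // Comment 2 }",
               "(*{(**)}*)"]:
    # Each sample is consumed completely, so end == len(sample).
    print(sample, fpc_comment_end(sample) == len(sample))
```

Because depth can grow without bound, this scanner needs a counter, which is the part a finite automaton cannot provide.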

Your assignment probably meant for you to use the Turbo Pascal style, where the comment delimiters must match. It definitely didn't expect you to use the Free Pascal style, since that would make the assignment impossible to complete: a DFA has only finitely many states, so it cannot keep track of arbitrarily deep nesting.
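For the Turbo Pascal style, one possible DFA can be written out as a transition table. The Python sketch below is only an illustration of that idea, not the assignment's expected answer. The state names and the accepts helper are invented here, and the automaton recognizes a string that consists of exactly one comment.

```python
# Explicit transitions: state -> {character: next state}.
TURBO_DFA = {
    "start":      {"{": "brace", "(": "lparen"},
    "lparen":     {"*": "paren"},            # saw "(", need "*" to open
    "brace":      {"}": "done"},             # inside { ... }
    "paren":      {"*": "paren_star"},       # inside (* ... *)
    "paren_star": {")": "done", "*": "paren_star"},  # saw "*" inside (* ...
    "done":       {},                        # comment closed; nothing may follow
}

# Where each comment-body state goes on any character not listed above.
DEFAULT = {"brace": "brace", "paren": "paren", "paren_star": "paren"}


def accepts(text):
    """True if `text` is exactly one Turbo Pascal-style comment."""
    state = "start"
    for ch in text:
        state = TURBO_DFA[state].get(ch, DEFAULT.get(state, "dead"))
        if state == "dead":
            return False
    return state == "done"


print(accepts("{(*comment*)}"))     # True
print(accepts("(*{comment}*)"))     # True
print(accepts("{{comment}}"))       # False
print(accepts("(*(*comment*)*)"))   # False
print(accepts("{(*{comment}*)}"))   # False
```

Six states plus a dead state are enough because the automaton only has to remember which opener it saw and whether the previous character was a *. It never has to remember how many comments are open.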