In a good overview, Dennie Van Tassel outlined four different types of comments:
- Full line comments, exemplified by REM in BASIC: the line contains only the comment, which runs to the end of the line.
- End-of-line comments, which in C/C++ would be //
- Block comments, '/* ... */' in C/C++
- Mega comments, which in C/C++ can be emulated by using // on every line or using the preprocessor with #if 0 ... #endif
We can ignore full line comments, since they're completely covered by end-of-line comments. C3 already has those, as well as /* ... */ block comments.
However, mega comments pose a problem. In C3 the analogue to #if 0 ... #endif is $if 0 ... $endif, but it would require the code inside to parse.
Since a typical use of mega comments is to paste a slab of C code inside the comment and then convert it piecemeal, $if 0 doesn't work.
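To make the C side of this concrete, here is a small sketch of why #if 0 ... #endif works as a mega comment where /* ... */ does not. The names twice() and legacy_scale() are made up for illustration, and legacy_scale() is deliberately left undefined: the disabled region is never compiled, and it can contain block comments of its own, which would end an enclosing /* ... */ at the first */.

```c
#include <assert.h>

/* Hypothetical helper for illustration: doubles a non-negative value. */
int twice(int x)
{
    assert(x >= 0); /* precondition chosen for this sketch */
#if 0
    /* Old implementation, kept around for reference. */
    int y = legacy_scale(x); /* legacy_scale() is hypothetical and undefined */
    return y;
#endif
    return 2 * x;
}
```

Wrapping the same three lines in /* ... */ would not compile in C, since the comment would end right after "for reference. */".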
What about making /* ... */ nesting?
In a 2017 article titled "Block Comments are a Bad Idea", Troels Henriksen argues that adding nesting to block comments does not really solve the problem, and shows the following example from Haskell, which uses {- ... -} for nested comments:

    {-
    s = "{-";
    -}

In the above example the {- inside of the string inadvertently opens a new nested comment. He rejects the idea that the lexer (or even worse, *the parser*) should track strings inside of comments. Instead Henriksen argues for either using #if 0 or // on every line. While the latter is exactly what Zig picked, it relies too much on the text editor for my taste.
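The failure is easy to reproduce with a naive depth counter. The following is only a sketch of how a string-unaware lexer might track {- ... -} nesting, not how any real Haskell implementation does it; open_depth() is a hypothetical name, and the function ignores everything except the two delimiters.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Scans `src` for Haskell-style {- and -} delimiters and returns the
 * nesting depth still open at the end of the text. 0 means every
 * opener was matched. String literals are deliberately NOT tracked.
 */
int open_depth(const char *src)
{
    assert(src != NULL);
    int depth = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        if (src[i] == '{' && src[i + 1] == '-') {
            depth++;   /* opener: go one level deeper */
            i++;       /* skip the second delimiter character */
        } else if (src[i] == '-' && src[i + 1] == '}' && depth > 0) {
            depth--;   /* closer: leave one level */
            i++;
        }
    }
    return depth;
}
```

Called on the text `{- s = "{-"; -}`, this returns 1 rather than 0: the {- inside the string counts as a second opener, so the final -} only closes the inner level and the comment is left unterminated.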
D introduces a new nesting comment, /+ ... +/. It acts just like /* ... */ except that it nests. Initially this was what I picked for C3.
However it has drawbacks:
- It introduces another comment type that is only marginally different from the others.
- It can have the s = "/+" problem just like "/*" – we just moved the problem.
- For beginners coming from C it's not obvious that this comment type is available, so it may get underused.
- It does not visually indicate that it should be used for mega comments rather than regular comments.
There's another point as well: #if 0 ... #endif can never have the s = "/*" issue by virtue of always starting and ending on its own line.
Doing some research I tried to determine if there was some "obvious" syntax that could convey the #if 0 ... #endif behaviour. I had a lot of examples (that I hated), like /--- ... ---/ /--> ... /<--- and even ideas of a heredoc style comment like /$FOO ... /$$FOO.
Ultimately I decided to pick /# ... #/ for these block comments. They acted like nested comments but were required to be on a new line, which bypasses this problem:

    /#
    s = "/#"; <- not recognized
    #/

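As a rough sketch of why the new-line requirement helps, the scanner below only treats /# and #/ as delimiters when they are the sole content of a line. That exact rule is an assumption made for this example (the text only says the delimiters must be on a new line), and line_is() and open_depth_line_anchored() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True if the line [line, line + len) equals `delim` once surrounding
 * spaces, tabs and a trailing '\r' are stripped. */
static int line_is(const char *line, size_t len, const char *delim)
{
    while (len > 0 && (*line == ' ' || *line == '\t')) {
        line++;
        len--;
    }
    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t' ||
                       line[len - 1] == '\r')) {
        len--;
    }
    return len == strlen(delim) && memcmp(line, delim, len) == 0;
}

/* Returns the /# ... #/ nesting depth still open at the end of `src`,
 * recognizing a delimiter only when it sits alone on its line. */
int open_depth_line_anchored(const char *src)
{
    assert(src != NULL);
    int depth = 0;
    while (*src != '\0') {
        size_t len = strcspn(src, "\n"); /* length of the current line */
        if (line_is(src, len, "/#")) {
            depth++;
        } else if (depth > 0 && line_is(src, len, "#/")) {
            depth--;
        }
        src += len;
        if (*src == '\n') {
            src++; /* step over the line break */
        }
    }
    return depth;
}
```

With this rule the /# inside the string literal is never seen as an opener, because it does not stand alone on its line, so the three-line example above ends at depth 0.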
But it turns out that this has issues of its own. What if you by accident write something like:

    /# int x;
    int y = foo();
    #/

or

    foo() /#
    int x;
    #/

You need a good heuristic to figure out a nice error message for these. For example you could either always decide that /#foo is /# + foo or maybe it's only like that if the /# starts a line, otherwise it's interpreted as / + #foo (which can be valid C3).
But after playing around with this for a while, I had to say that the value from this seemed much less than I had hoped. Yes, it's distinct, but it has most of the problems with /+ ... +/ in terms of lack of familiarity. And if I'm honest with myself, I'm personally still mostly using /* ... */ over #if 0 ... #endif where I can.
So we've come full circle: from nesting /* ... */, to distinct nesting block comments, to #if 0 ... #endif, and now perhaps back to nesting /* ... */?
For now at least, C3 will add nesting to /* ... */ and remove /+ ... +/. This is an imperfect solution, but possibly also a reasonable trade-off that keeps the language familiar, with features that pull their weight.