I’ve been using this construct since I started programming in C. I didn’t come to it myself, but I don’t recall anyone teaching it to me explicitly either. It was common knowledge back then. I was completely sure it was still common knowledge until very recently. One colleague asked me about it during a routine code inspection procedure, and I couldn’t explain it then. I’ll try to do it now.
The syntax of a macro invocation with parameters in C/C++ is identical to that of a function call. In some cases it can cause confusion but usually it is convenient, because it makes it possible to switch from function implementation to macro and back:
The same applies to void
functions:
One can call dump(x)
anywhere in the code without knowing if it is a function or a macro.
Now consider the following problem: make a void function-style macro with two (or more) statements in it. Take the following function:
Now we want to convert it into a macro:
It is easy to see why it is wrong. These are two statements instead of one, so it won’t work where exactly one statement is expected:
which will expand as
The printf
will be called if some_condition
is true, while x
will be assigned 1
in any case.
In our specific example there is another thing that is wrong: it can’t handle parameters with side-effects. It is sufficient to try calling
and see what happens.
Ok, then, maybe wrapping in curly brackets will work? It could fix both problems:
The reason this won’t work is more subtle: it is that C requires semicolon after any statement except for compound
statement. A function call is followed by a semicolon; but if we put semicolon after this macro invocation,
we’ll have two statements: a compound statement and an empty statement. There are places where exactly one statement
is allowed, for instance in the if
statement:
This will be expanded as
This won’t compile, because the empty statement (a solitary semicolon) effectively terminates any if
statements
that were opened at that point. On a positive side, it just won’t compile; previous solution compiled but didn’t work.
However, we can do better.
What we need is some structure that naturally ends with a semicolon but allows many statements inside.
This is where a do
statement comes handy:
(I remind you that a while
condition in a do
statement is a post-condition: it is checked after the iteration;
this means that this do
loop executes exactly once).
The above example is then expanded as
and this compiles correctly.
For some people this looks like an ugly hack; for others it looks like a beautiful hack. Languages that do not require semicolons (such as Go), or languages that allow multiple statements everywhere automatically (such as Modula-2) don’t experience this difficulty; but usually they don’t have macros either, so the problem is not relevant there.
P.S.
In the above example I deliberately pretended that I didn’t know the comma operator. Many multi-statement macros do not allow implementation using comma, but this specific example does. The first naïve attempt
fails miserably. C does not guarantee the order of execution of function parameters, and the behaviour is
undefined. In gcc
the order is exactly the opposite to the one we wanted, so this code
prints
Changing value from 15 to 15
This is a better code:
This code works, but suffers from mentioned side-effect issue. This can be resolved, too, by using a global variable:
but this is definitely an ugly hack.
P.P.S.
There are still cases where one can use functions but not macros of this type. I can think of two. One is where two void functions are combined using comma operator:
Neither of them can be replaced with a macro using the do
technique. Another case if a for
statement:
Here f()
and h()
can be void
functions but not macros with compound statements inside. Fortunately, these are
exotic cases.
P.P.P.S.
There is at least one more language construct (apart from do
) that can be used this way. This is the if
statement: