Experiments in program optimisation

One handy C idiom: multi-statement macro

Tags: C macro idiom

05 May 2014

I’ve been using this construct since I started programming in C. I didn’t invent it myself, but I don’t recall anyone explicitly teaching it to me either. It was common knowledge back then, and I was completely sure it still was until very recently, when a colleague asked me about it during a routine code inspection and I couldn’t explain it. I’ll try to do so now.

The syntax of a macro invocation with parameters in C/C++ is identical to that of a function call. In some cases this can cause confusion, but usually it is convenient, because it makes it possible to switch from a function implementation to a macro and back:

#if MACRO
    #define add(x,y) ((x)+(y))
#else
    static int add (int x, int y) { return x + y; }
#endif

The same applies to void functions:

#if MACRO
    #define dump(x) printf ("x=%d\n", x)
#else
    static void dump (int x) { printf ("x=%d\n", x); }
#endif

One can call dump(x) anywhere in the code without knowing if it is a function or a macro.

Now consider the following problem: make a void function-style macro with two (or more) statements in it. Take the following function:

void set_value (int new_value)
{
    printf ("Changing value from %d to %d\n", x, new_value);
    x = new_value;
}

Now we want to convert it into a macro:

#define set_value(new_value) printf ("Changing value from %d to %d\n",\
                                     x, new_value);\
                             x = new_value

It is easy to see why it is wrong. These are two statements instead of one, so it won’t work where exactly one statement is expected:

if (some_condition)
    set_value (1);

which will expand as

if (some_condition)
    printf ("Changing value from %d to %d\n",
            x, 1);
    x = 1;

The printf will be called if some_condition is true, while x will be assigned 1 in any case.
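The effect can be checked with a small sketch. The helper name guarded_set and the renamed macro set_value_naive are hypothetical, introduced only for this illustration; the macro body is the naive two-statement version from above.

```c
#include <stdio.h>

static int x = 0;   /* the global that the macro updates, as in the article */

/* The naive two-statement macro: NOT safe where one statement is expected. */
#define set_value_naive(new_value) \
    printf("Changing value from %d to %d\n", x, new_value); \
    x = new_value

/* Hypothetical helper: returns the value of x after a guarded call. */
static int guarded_set(int some_condition)
{
    x = 0;
    if (some_condition)
        set_value_naive(1);   /* only the printf is guarded by the if */
    return x;
}
```

Calling guarded_set(0) prints nothing, yet still returns 1: the assignment sits outside the if.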

In our specific example there is another thing that is wrong: it can’t handle parameters with side-effects. It is sufficient to try calling

    set_value (y++);

and see what happens.
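What happens can be sketched as follows. The helper double_evaluation and the macro name set_value_naive are hypothetical names used only here; the macro body is the naive version from above. Because the parameter is pasted into the expansion twice, y++ is evaluated twice, once in each of the two statements.

```c
#include <stdio.h>

static int x = 0;

/* The naive macro: new_value appears twice in the expansion. */
#define set_value_naive(new_value) \
    printf("Changing value from %d to %d\n", x, new_value); \
    x = new_value

/* Hypothetical helper: stores the final y through out_y and returns x. */
static int double_evaluation(int *out_y)
{
    int y = 5;
    x = 0;
    set_value_naive(y++);  /* becomes: printf(..., x, y++); x = y++; */
    *out_y = y;
    return x;
}
```

Starting from y = 5, the message reports the new value as 5, x ends up as 6, and y ends up as 7, so the printed value and the stored value disagree.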

Ok, then, maybe wrapping in curly brackets will work? It could fix both problems:

#define set_value(new_value)  {\
        int v = new_value;\
        printf ("Changing value from %d to %d\n", x, v);\
        x = v;\
    }

The reason this won’t work is more subtle: C requires a semicolon after an expression statement, such as a function call, but not after a compound statement. If we put a semicolon after this macro invocation, we get two statements: a compound statement and an empty statement. There are places where exactly one statement is allowed, for instance in the if statement:

if (condition)
    set_value (1);
else
    something_else ();

This will be expanded as

if (condition)
{
    int v = 1;
    printf ("Changing value from %d to %d\n", x, v);
    x = v;
};
else
    something_else ();

This won’t compile: the empty statement (a solitary semicolon) ends the if statement, so the else that follows has no if to attach to. On the positive side, it fails at compile time; the previous solution compiled but didn’t work. However, we can do better.

What we need is some structure that naturally ends with a semicolon but allows many statements inside. This is where a do statement comes in handy:

#define set_value(new_value)  do {\
    int v = new_value;\
    printf ("Changing value from %d to %d\n", x, v);\
    x = v;\
} while (0)

(Recall that the while condition in a do statement is a post-condition: it is checked after each iteration, so the body of this loop executes exactly once.)

The above example is then expanded as

if (condition)
    do {
        int v = 1;
        printf ("Changing value from %d to %d\n", x, v);
        x = v;
    } while (0);
else
    something_else ();

and this compiles correctly.
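A minimal sketch that puts the do version to work in an if/else and also passes an argument with a side effect. The helper set_or_reset is hypothetical, x = -1 stands in for something_else (), and the parentheses around new_value are an extra precaution not present in the listing above.

```c
#include <stdio.h>

static int x = 0;

/* The do { ... } while (0) idiom: exactly one statement once ';' is added. */
#define set_value(new_value) do { \
    int v = (new_value); \
    printf("Changing value from %d to %d\n", x, v); \
    x = v; \
} while (0)

/* Hypothetical helper: returns x after taking one of the two branches. */
static int set_or_reset(int condition, int *counter)
{
    x = 0;
    if (condition)
        set_value((*counter)++);  /* the argument is evaluated exactly once */
    else
        x = -1;                   /* stand-in for something_else () */
    return x;
}
```

One caveat of this sketch: the local name v shadows any v in the argument expression, so a call such as set_value (v + 1) would misbehave. Real code often picks a less likely name for the temporary.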

For some people this looks like an ugly hack; for others it looks like a beautiful hack. Languages that do not require semicolons (such as Go), or languages that allow multiple statements everywhere automatically (such as Modula-2) don’t experience this difficulty; but usually they don’t have macros either, so the problem is not relevant there.

P.S.

In the above example I deliberately pretended that I didn’t know the comma operator. Many multi-statement macros cannot be implemented using the comma operator, but this specific example can. The first naïve attempt

#define set_value(new_value) printf ("Changing value from %d to %d\n",\
                                     x, x = new_value)

fails miserably. C does not specify the order in which function arguments are evaluated, and since x is both read and modified here without any sequencing between the two, the behaviour is undefined. In gcc the order is exactly the opposite of the one we wanted, so this code

x = 10;
set_value(15);

prints

Changing value from 15 to 15

This is better code:

#define set_value(new_value) printf ("Changing value from %d to %d\n",\
                                     x, new_value)\
                             , x = new_value

This code works, but suffers from the side-effect issue mentioned earlier. This can be resolved, too, by using a global variable:

int v;
#define set_value(new_value) v = new_value \
                             , printf ("Changing value from %d to %d\n", x, v)\
                             , x = v

but this is definitely an ugly hack.

P.P.S.

There are still cases where one can use functions but not macros of this type. I can think of two. One is where two void functions are combined using the comma operator:

f(), g();

Neither of them can be replaced with a macro using the do technique. Another case is a for statement:

for (f(); g(); h())
   p();

Here f() and h() can be void functions but not macros with compound statements inside. Fortunately, these are exotic cases.
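For macros that can be written as a single expression, the comma-operator form from the P.S. still fits in these positions. Here is a sketch under that assumption; the names set_value_expr, set_value_tmp and count_iterations are hypothetical, and the scratch global makes the macro unsuitable for re-entrant or multi-threaded code.

```c
#include <stdio.h>

static int x = 0;
static int set_value_tmp;   /* scratch global, the "ugly hack" from the P.S. */

/* Expression-style macro: the comma operator sequences the three steps. */
#define set_value_expr(new_value) \
    (set_value_tmp = (new_value), \
     printf("Changing value from %d to %d\n", x, set_value_tmp), \
     x = set_value_tmp)

/* Hypothetical helper: uses the macro in both the init and step clauses. */
static int count_iterations(int limit)
{
    int iterations = 0;
    for (set_value_expr(0); x < limit; set_value_expr(x + 1))
        iterations++;
    return iterations;
}
```

A do { ... } while (0) macro would be rejected in either clause of the for statement, because a statement is not allowed where an expression is expected.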

P.P.P.S.

There is at least one more language construct (apart from do) that can be used this way. This is the if statement:

#define set_value(new_value)  if (1) {\
    int v = new_value;\
    printf ("Changing value from %d to %d\n", x, v);\
    x = v;\
} else
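A sketch of how this variant behaves, with hypothetical names set_value_if, set_or_reset_if and forgotten_semicolon. The semicolon written after the invocation becomes the empty body of the macro's own else, so an outer else still pairs with the outer if. Some compilers may warn about the ambiguous-looking else or the empty body.

```c
#include <stdio.h>

static int x = 0;

/* The if (1) { ... } else variant: the caller's ';' completes the else. */
#define set_value_if(new_value) if (1) { \
    int v = (new_value); \
    printf("Changing value from %d to %d\n", x, v); \
    x = v; \
} else

/* Hypothetical helper: the macro used inside an outer if/else. */
static int set_or_reset_if(int condition)
{
    x = 0;
    if (condition)
        set_value_if(1);   /* ';' is the empty body of the macro's else */
    else
        x = -1;
    return x;
}

/* Hypothetical helper: shows what a missing semicolon does. */
static int forgotten_semicolon(void)
{
    x = 0;
    set_value_if(1)        /* no ';': the next statement becomes the else branch */
        x = 99;            /* compiles, but is never executed */
    return x;
}
```

The second helper shows a hazard of this form: a forgotten semicolon silently swallows the following statement, whereas with the do version the same mistake is a compile error.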