Programming Topics + CVu Journal Vol 11, #5 - Aug 1999

Title: Compile Time Assertions in C Revisited

Author: Administrator

Date: Tue, 03 August 1999 13:15:32 +01:00

I am pleased that my article on compile time assertions has provoked some feedback. To start off I'll paste in the final code from my C Vu 11.3 article.

#include <limits.h>

#define COMPILE_TIME_ASSERT(pred)       switch(0){case 0:case pred:;}

#define ASSERT_MIN_BITSIZE(type,size)        \
  COMPILE_TIME_ASSERT(sizeof(type) * CHAR_BIT >= size)

#define ASSERT_EXACT_BITSIZE(type,size)       \
  COMPILE_TIME_ASSERT(sizeof(type) * CHAR_BIT == size)

void compile_time_assertions(void) {
  ASSERT_MIN_BITSIZE(char,  8);
  ASSERT_MIN_BITSIZE(int,  16);
  ASSERT_EXACT_BITSIZE(long, 32);
}

A number of readers criticised the above code. Too complicated, they cried. There are two ways to interpret such remarks. The first is that the technique in general is too complicated. I have a lot of sympathy with this point of view. C was not designed with compile time assertions in mind. You might reject compile time assertions altogether and opt for a more conventional runtime solution. As Kevlin Henney pointed out in his letter to the editor (C Vu 11.4), pre-processor based compilation errors are, to say the least, less than helpful.
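For comparison, the conventional runtime route might look like the following minimal sketch. The function name type_sizes_ok is hypothetical, and the long check is relaxed here to "at least 32 bits" so that it mirrors the limits the C standard guarantees rather than one particular platform.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if the type sizes the program relies
 * on hold, 0 otherwise. The check is evaluated when the function is
 * called, not when the program is compiled. */
int type_sizes_ok(void)
{
    return CHAR_BIT >= 8
        && sizeof(int)  * CHAR_BIT >= 16
        && sizeof(long) * CHAR_BIT >= 32;
}
```

A program would typically call this once at start-up, for example as assert(type_sizes_ok()), so a violated assumption only surfaces when the program is run.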

The second interpretation is that this particular technique is too complicated and that there are simpler techniques for implementing compile time assertions. Strange as it may seem, I also have a lot of sympathy for this view. Again, whichever way you try to skin it, this particular cat is tricky. C was not designed with compile time assertions in mind. Compile time assertions are right on the edge - it's highly likely that individual preferences will come into play. However, I stand by my implementation of compile time assertions as presented above. Well, almost. If ever a piece of code cried out for a comment, this is it. But let me be specific: I do not think that the function compile_time_assertions needs any comments (although in hindsight I think type_size_constraints would be a better name). Nor do I think that the ASSERT_MIN_BITSIZE macro needs a comment. ASSERT_EXACT_BITSIZE is fine as it stands too, imho. No, the comment is needed for the COMPILE_TIME_ASSERT macro. This is decidedly not obvious. What should the comment for COMPILE_TIME_ASSERT say? The text of the previous article would be a good place to start.
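One possible shape for such a comment is sketched below. The wording is illustrative and is not taken from the earlier article. The sketch also wraps the predicate in parentheses and checks long for at least 32 bits, both of which are assumptions made here so that the example compiles on common platforms.

```c
#include <limits.h>

/*
 * COMPILE_TIME_ASSERT(pred)
 *
 * pred must be an integer constant expression.
 * If pred is true it has the value 1, so the switch has the labels
 * case 0 and case 1 and compiles quietly.
 * If pred is false it has the value 0, so the switch has two case 0
 * labels. Duplicate case labels are a constraint violation, which
 * the compiler must diagnose.
 * The macro expands to a statement, so use it inside a function.
 */
#define COMPILE_TIME_ASSERT(pred) switch (0) { case 0: case (pred): ; }

void type_size_constraints(void)
{
    COMPILE_TIME_ASSERT(sizeof(int)  * CHAR_BIT >= 16);
    COMPILE_TIME_ASSERT(sizeof(long) * CHAR_BIT >= 32);
}
```

The function never needs to be called; it only has to be compiled. Changing one of the predicates to something false, such as sizeof(char) == 2, should turn the build into a "duplicate case value" style error.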

Arithmetic in the Preprocessor?

In C Vu 11.4 Silas suggested an alternative technique for asserting a compile time requirement.

#include <limits.h>
#if CHAR_BIT < 8
#error This program assumes a compiler that supports 8-bit chars
#endif

#define byte1(c)     ((c) & 0xFF)
#define byte2(c)     byte1((c) >> 8)
#define byte3up(c)   ((c) >> 16)
#define TwoByteFail(c,min2,min1)  \
        (!byte3up(c) && byte2(c) < min2 || byte2(c) < min1)

#if TwoByteFail(USHRT_MAX, 0xFF, 0xFF)
#error This program assumes a compiler that supports 16-bit shorts
#endif

I assume that Silas meant to write #if CHAR_BIT != 8. If CHAR_BIT is less than eight then your compiler is non-conforming![1] The C standard states that CHAR_BIT is an expression (usable in a #if directive) that is greater than or equal to eight. There is also the style issue that function-like macros are conventionally written in uppercase.

Here be dragons

The fundamental problem, as I see it, with this approach is that it uses constant-expressions in a #if directive. This raises a question: what are the types of the constants used in the constant-expressions? This matters because there are shift operators. If USHRT_MAX is an unsigned integer then the right-shift will introduce 0's in the most-significant bit. If USHRT_MAX is a signed integer then the right-shift might introduce 1's in the most-significant bit.[2]
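The signed versus unsigned distinction can be seen in ordinary C with a small sketch. Both function names are hypothetical, and both functions assume the shift count is smaller than the width of the type.

```c
/* Well defined: the vacated high bits of an unsigned value are
 * always filled with 0. */
unsigned int shift_right_unsigned(unsigned int value, unsigned int count)
{
    return value >> count;
}

/* Well defined only when value is non-negative. For a negative value
 * the result is implementation-defined: many compilers copy the sign
 * bit into the vacated bits, but the standard does not require it. */
int shift_right_signed(int value, unsigned int count)
{
    return value >> count;
}
```

For non-negative inputs the two functions agree. They may differ for negative inputs, which is exactly the case that matters when a macro such as byte3up is applied to a value whose type is not known.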

The type of a literal constant will depend on the platform. For unsigned short for example, the effect of the promotion rules is to favour promoting to int wherever possible, but to promote to unsigned int if necessary to preserve the original value in all possible cases. This is bad news because the macros are designed to test type sizes.

But hang on. The promotion rules of C are the promotion rules of C. The preprocessor is not C. Look carefully at the C standard and you'll find that inside a #if directive the promotion rules are different! The pre-processor is obliged to promote to long instead of int and to unsigned long instead of unsigned int. It's highly likely that the type of USHRT_MAX will be a signed long.
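A minimal sketch of the mismatch follows. It assumes a 32-bit int and a preprocessor whose arithmetic type is wider than 32 bits (C90 specifies long and unsigned long; C99 and later specify intmax_t and uintmax_t). The macro and function names are hypothetical, and other platforms may give different answers.

```c
/* The preprocessor evaluates this in its own wide signed type, where
 * 0xFFFFFFFF is an ordinary positive value, so the test is true. */
#if -1 < 0xFFFFFFFF
#define PP_SAYS_LESS 1
#else
#define PP_SAYS_LESS 0
#endif

/* What the preprocessor decided for: -1 < 0xFFFFFFFF */
int preprocessor_says_less(void)
{
    return PP_SAYS_LESS;
}

/* What the compiler proper decides for the same expression. With a
 * 32-bit int, 0xFFFFFFFF has type unsigned int, so -1 is converted
 * to UINT_MAX and the comparison is false. */
int compiler_says_less(void)
{
    return -1 < 0xFFFFFFFF;
}
```

Under those assumptions the same expression is true inside #if and false in the compiled code, so a test written in one world says little about the other.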

And it's potentially even worse than that! If you're cross compiling, the pre-processor won't even be running on the target computer. For all these reasons I think a solution that avoids #if directives is well-advised.

Do it yourself

To return to Kevlin's letter for a moment. He writes "pre-processor symbols lose their identity in the process of compilation, which leads to succinct but useless messages..." Can compile time assertions be achieved without the pre-processor? Of course - do the expansion yourself.

#include <limits.h>

void type_size_constraints(void) {
  switch(0){case 0: case CHAR_BIT == 8:;}
  switch(0){case 0: case sizeof(int)*CHAR_BIT >= 16:;}
  switch(0){case 0: case sizeof(long)*CHAR_BIT == 32:;}
}

With the compile time mechanism out in the open the game has changed slightly. There's a lot to be said for using one of the other techniques. For example:

#include <limits.h>
char require_char_exactly_8_bits[CHAR_BIT == 8];
char require_int_at_least_16_bits[sizeof(int)  * CHAR_BIT >= 16];
char require_long_exactly_32_bits[sizeof(long) * CHAR_BIT == 32];

This does not require an enclosing function. We can name the arrays to document the constraint. Each array will only be a single character long. There's a good chance we'll get fewer compiler warnings (than the switch solution). And we won't be affected by the changes proposed in the new C standard (which Francis hinted at in his editor's comment). It occurs to me that we can test the type sizes in terms of bytes rather than bits. For example:

#include <limits.h>
char require_8_bit_byte[CHAR_BIT == 8];
char require_2_byte_int[sizeof(int) == 2]; 
char require_4_byte_long[sizeof(long) == 4];

These are all definitions. You would not be able to put them in a header file. To do that we need declarations. There are a number of possibilities.

#include <limits.h>
/* Alternatives: use only one, since they all declare the same name */
/* 1. extern array declaration */
extern char require_8_bit_byte[CHAR_BIT == 8];
/* 2. function prototype with array parameter */
void require_8_bit_byte(char[CHAR_BIT == 8]);
/* 3. array typedef */
typedef char require_8_bit_byte[CHAR_BIT == 8];
/* 4. struct with array member */
struct type_size_constraints {
    char require_8_bit_byte[CHAR_BIT == 8];
    char require_2_byte_int[sizeof(int) == 2];
    char require_4_byte_long[sizeof(long) == 4];
};

The last approach has the benefit of providing a grouping mechanism for all compile time constraints in a file. However, this might fall foul of compilers extended to allow zero sized arrays as a last struct member. To get round this we can provide a dummy last member.

struct type_size_constraints {
    char require_8_bit_byte [CHAR_BIT == 8];
    char require_2_byte_int [sizeof(int) == 2];
    char require_4_byte_long[sizeof(long) == 4];
    char non_empty_dummy_last_member[1];
};

This seems to be getting close to a distilled essence. Still with a chunky comment of course.
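One caveat is worth a sketch. Some compilers accept zero sized arrays more widely than as a last struct member (GCC, for instance, allows them as an extension unless asked to be pedantic), and there a false predicate could pass silently. A commonly used variant, which is not the form used in this article, makes the failing size negative instead. The member names and the relaxed "at least" constraints below are assumptions chosen so that the example compiles on common platforms.

```c
#include <limits.h>

/* Variant sketch: a true predicate gives an array of size 1, a false
 * predicate gives size -1, which is rejected even by compilers that
 * tolerate size 0. No dummy last member is needed. */
struct type_size_constraints {
    char require_8_bit_byte     [(CHAR_BIT == 8) ? 1 : -1];
    char require_min_16_bit_int [(sizeof(int)  * CHAR_BIT >= 16) ? 1 : -1];
    char require_min_32_bit_long[(sizeof(long) * CHAR_BIT >= 32) ? 1 : -1];
};
```

The struct is only ever declared, never instantiated, so it costs nothing at run time.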

Thanks to Kevlin for advice and comments.

That's all for now

Cheers

Bibliography

Quote from Gerald Weinberg in The Psychology of Computer Programming. "The programming business relies more than any other on unending learning".



[1] There was a superfluous assertion in the previous article...

  ASSERT_MIN_BITSIZE(char, 8);

[2] Notice that USHRT_MAX cannot be a literal of type short. There is the L suffix for long literals but there is no S suffix for short literals, presumably because of the default short-to-int promotion in K&R C. Notice also, in E1 >> E2, if E1 has a signed type and a negative value, the resulting value is implementation-defined.
