Journal Articles

CVu Journal Vol 28, #4 - September 2016 + Programming Topics
Browse in : All > Journals > CVu > 284 (10)
All > Topics > Programming (877)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Random Confusion

Author: Martin Moene

Date: 05 September 2016 16:35:15 +01:00 or Mon, 05 September 2016 16:35:15 +01:00

Summary: Silas S. Brown tries to clear up a muddle about Standard C’s rand().

Body: 

The Standard C way of obtaining random numbers is to call srand() and rand(), defined in stdlib.h. Unfortunately, the manual pages of BSD state categorically that rand() is not very good, and a function called random() should be used in its place. This includes Mac OS X (which was derived from BSD), and it would appear that some poorly-written books have copied this advice without realising that random() is not actually Standard C. Searching the Web will show quite a few discussion pages where beginners have read books that told them to use random() instead of rand(), and others have told them this is not Standard C and the book is not very good, but they rarely note where the confusion came from. (I would chime in myself on those threads if it weren’t so much hassle to sign up for an account on each one.)

So perhaps we should make a little public service announcement: rand() is Standard C; random() is Standard POSIX.

True, random()’s being in POSIX means it’s available on all modern Unix systems including BSD, Mac OS X and GNU/Linux. But it’s not there on Windows and the confusion is (for once) not Microsoft’s fault. It’s also unlikely to be there on other platforms (PDAs and so on).

When the BSD manual says rand() is not very good, what it really means is, BSD’s implementation of it is not very good. The C standard gives a sample implementation of rand but does not mandate that particular implementation. BSD (at least the version of it in Apple’s source) essentially just multiplies the previous number by 16807 and leaves it at that. In the early 1980s, 4.2BSD (1983) and 4.3BSD (1986) introduced random() to be a ‘better rand’, but they didn’t upgrade the behaviour of their rand(), presumably because they didn’t want to break any programs which relied on their old implementation-specific behaviour of this function.

The GNU/Linux C library (at least nowadays) does the sensible thing and makes rand() equal to random(), so you get the better behaviour no matter which one you call. In that case I suggest you call rand() because then you’re writing Standard C. But their manual page suggests you prefer random() because then your code will work better on BSD.

The Windows implementation of rand() is better than BSD’s: it multiplies by 214013, adds 2531011 and returns bits 30 to 16 of the result. Of course, that’s not good enough for cryptography (and you certainly shouldn’t rely on the standard rand() being good enough for cryptography on any platform; if you want cryptographically-strong random numbers, make sure you can find out how to get them properly!), but it should be fine for most scientific or simulation purposes and you don’t have to worry about disregarding the low-order bits (which is just as well, as there are only 15 bits to start with).

(Versions of WINE prior to 2006 just did rand() & 0x7fff, but this was then replaced with an implementation of what MSVC actually does. So if you’re on Unix and cross-compiling for Windows, you can run your program in WINE and get the same random sequence that it would get on Windows if seeded to the same value.)

So what to do about random()? A first thought might be ‘use rand() on Windows, random() everywhere else’ but that won’t help with ports to non-Windows non-Unix systems (yes these do exist). Since GNU/Linux nowadays gives you good behaviour no matter which one you call, it would be better to say ‘use random() on BSD, rand() everywhere else’. You can do that by checking for the BSD macro, which is defined in sys/param.h (a header file which is also available when cross-compiling for Windows) – see Listing 1.

#include <stdlib.h>
#include <sys/param.h>
#ifdef BSD
/* avoid BSD's bad old rand() implementation */
#define rand random
#define srand srandom
#endif /* BSD */
			
Listing 1

If you want to make your program’s behaviour 100% reproducible no matter which system it is compiled on, you had better provide your own random number generator, because you never know which one you’re going to get from the standard library. Seeding it to the same value makes the sequence reproducible only on that particular implementation of the C library (and that particular version of it at that: most platforms are not as obsessive as BSD is about keeping their old behaviour from version to version). The code in Listing 2 should reproduce Microsoft’s behaviour; I’m setting the functions to static so they won’t be visible outside the current C module, just in case some library function depends on the library’s implementation of rand(). Feel free to use this in your code as it seems to be a public-domain method (at least I hope so; if Microsoft sues you for using two of their numbers, change them! but do check some good texts for alternatives, as you’re less likely to end up with a good generator if you just pick them out of thin air yourself.)

#ifdef RAND_MAX
#undef RAND_MAX
#endif
#define RAND_MAX 0x7FFF
static unsigned long seed = 1;
static void srand(unsigned int newSeed) {
   seed = newSeed;
}
static int rand() {
   seed = (seed*214013L) + 2531011L;
   return (seed >> 16) & 0x7FFF;
}
			
Listing 2

The XorShift generator discovered by George Marsaglia in 2003 gives generally better random numbers and is usually faster. It can be implemented thus:

  #include <stdint.h>
  int XS_rand() {
    static uint64_t s=1;
    s^=s>>12; s^=s<<25; s^=s>>27;
    return s & 0x7FFFFFFF;
  }

On x86 processors, bit shifts and XORs can be done in one cycle each, whereas multiplications can take 3 or 4 cycles depending on the processor (although pipelining and other types of instruction-level parallelism can hide that latency in some circumstances). Even additions can take multiple cycles on some CPUs. Thus Xorshift (which takes three XORs and three shifts) is almost certainly going to be at least as fast, and likely faster, than a common linear congruential generator which uses multiplication and addition.

Notes: 

More fields may be available via dynamicdata ..