Journal Articles

Overload Journal #29 - Dec 1998 + Programming Topics

Title: Static vs Member Functions

Author: Administrator

Date: 27 December 1999 17:23:23 +00:00

I had the following correspondence with a customer, and, having put some effort into the explanation, it occurred to me to follow your example and recycle it into an article for Overload.

There is a static member function in the example you gave me. I don't really understand the details about how a static function runs under multithreading environments. Would you please tell me the details?

Using static data can be a problem in a multithreaded program, but static functions aren't. Static functions are just a scoping mechanism to prevent polluting the global namespace.

Here's some example code for discussion:

class foo
{
private:
  int my_value;
  // every foo object has one of these
  static int default_value;
  // only one of these in the program
public:
  foo( int new_value )
  { my_value = new_value; }
  foo()
  { my_value = foo::default_value; }
  void change_value( int new_value )
  { my_value = new_value; }

  static void
  set_default_value( int new_value )
  { foo::default_value = new_value; }
  static void 
  change_some_foo(foo * a_foo, int new_value )
  { a_foo -> my_value = new_value; }
};

int foo::default_value = 10;
// must be defined and initialized
//   outside the class

Let's consider the regular case first, of ordinary non-static members. Each instance of the foo class contains its own copy of the data member my_value. So if I say

  foo first_foo( 5 );
  foo second_foo( 7 );
  foo third_foo;

I now have three instances of foo objects in my program, each with a different number inside -- 5, 7, and 10. If it helps you visualise this, draw three boxes with the data values inside and label them with the names of the instances.
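
The three boxes can also be sketched in code. This is only an illustration: the accessor value() and the helper sum_of_three() are not part of the class above; they are hypothetical additions so that the stored numbers can be read back.

```cpp
// Sketch of the three instances described above.
// value() is a hypothetical accessor added only for this illustration.
class foo
{
private:
  int my_value;              // every foo object has one of these
  static int default_value;  // only one of these in the program
public:
  foo( int new_value ) { my_value = new_value; }
  foo() { my_value = foo::default_value; }
  int value() const { return my_value; }
};

int foo::default_value = 10;

// Builds the three objects and returns the sum of their contents.
int sum_of_three()
{
  foo first_foo( 5 );
  foo second_foo( 7 );
  foo third_foo;             // picks up the shared default
  return first_foo.value() + second_foo.value() + third_foo.value();
}
```

Each object carries its own my_value, while all three share the single foo::default_value.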

When you define a variable, you are telling the compiler to reserve a little piece of memory the size of a foo object, and also telling it the name you will use to manipulate that piece of memory.

If I now say

    first_foo.change_value( 13 );

what I am actually doing is invoking some compiler magic. Because this is a (non-static) member function, it has to operate on a specific instance of a foo object, in this case, the one called first_foo. Behind the scenes, the compiler actually changes my call to something like this:

    foo::change_value( &first_foo, 13 );

And it rewrites the function up in the class definition to look like this:

void foo::change_value
  ( foo * const this, int new_value )
{ this -> my_value = new_value; }

The foo:: before the function call means that this function only exists within the context of the foo class - it's meaningless to call change_value() without specifying that it's the function defined in class foo that I want. (There might be a global function called change_value() or other classes might have a change_value() function; all of them can exist in the same program and I call whichever I need with the scoping mechanism.)

On the other hand, the static function foo::change_some_foo() does not have a this pointer added to its parameter list. Often such a function does not refer to any specific foo object; it might as well be a global function declared outside the class (but that would pollute the global namespace and risk name conflicts with functions from other libraries). Similarly, when you declare a static data member, it means there is only one instance of that variable in the whole program, and you refer to it by using the class scope operator (like foo::default_value) to prevent name clashes.
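
One way to see the missing this pointer is to look at the types of the two functions. The sketch below is a cut-down version of foo; the typedef names, the accessor value() and the function demo_pointers() are hypothetical and exist only for the illustration.

```cpp
// Sketch: how the two kinds of member function differ in type.
class foo
{
  int my_value;
public:
  foo() : my_value( 0 ) {}
  void change_value( int new_value ) { my_value = new_value; }
  static void change_some_foo( foo * a_foo, int new_value )
  { a_foo -> my_value = new_value; }
  int value() const { return my_value; }   // hypothetical accessor
};

// A static member function has the type of an ordinary function...
typedef void (*plain_function)( foo *, int );
// ...whereas a non-static one can only be called through an object.
typedef void (foo::*member_function)( int );

int demo_pointers()
{
  plain_function  p = &foo::change_some_foo;
  member_function m = &foo::change_value;

  foo a, b;
  p( &a, 13 );        // the object is passed explicitly
  (b.*m)( 29 );       // the object supplies the hidden this pointer
  return a.value() + b.value();
}
```

The address of the static function converts to a plain function pointer, which is what makes it usable wherever a C-style callback is expected.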

To get off topic for a minute, in recent years there has been a growing trend in C++ programming to package up otherwise-global data and functions into artificial classes, simply to avoid name conflicts. Imagine that you bought two or three different libraries and each contained a global variable called Library_Version_Number; you couldn't use them together without changing the vendors' source code. It was this trend that led to the introduction of namespaces into the language.

Apart from the issue of name conflicts, packaging static data and functions into a class helps to show that they are related to the purpose of the class, as with foo::default_value above. It's better to keep everything together than to create a dependency on a global variable (if only because it simplifies the documentation effort.)

Getting back to the static member function change_some_foo(), if it needs to access a specific foo instance, you have to tell it which one to look at - hence the a_foo parameter. Because the static function is a member of the class, it has access to the private parts of that class. If it were a global function, I would have had to make it a friend of the class, creating another dependency that makes it harder to reuse the class.

When it comes to multithreading, there is another reason for using static functions. You can't tell the thread to run a particular member function on a particular object, because there is no way to signal the compiler to add the invisible this pointer (remember that the underlying system's threads library is written in C and doesn't understand OO programming). So I told the thread to run a static member function and passed it the address of the instance to operate on. If I had several member functions I might want to invoke, I would have written several static functions to correspond to them.
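
A minimal sketch of that pattern follows. It is not the code from the original correspondence: the class worker and its members are hypothetical, and run_in_thread() is a stand-in for a C threads call that simply invokes the routine directly, so the example needs no threads library at all.

```cpp
// Sketch of the "static member function as thread entry point" pattern.
// run_in_thread() is a hypothetical stand-in for a C threads call; it
// calls the routine directly instead of starting a real thread.
typedef void * (*start_routine)( void * );

void * run_in_thread( start_routine routine, void * argument )
{
  return routine( argument );
}

class worker   // hypothetical class, not from the original example
{
  int result;
public:
  worker() : result( 0 ) {}

  // The member function we actually want the thread to run.
  void do_work() { result = 42; }

  // Static, so it has no hidden this pointer and matches the C-style
  // signature. The object arrives through the void * argument.
  static void * thread_entry( void * argument )
  {
    worker * self = static_cast<worker *>( argument );
    self -> do_work();
    return 0;
  }

  int get_result() const { return result; }
};

int demo_thread_entry()
{
  worker w;
  run_in_thread( &worker::thread_entry, &w );
  return w.get_result();
}
```

With POSIX threads, the same two values (the address of thread_entry and the address of the object) would be passed as the last two arguments of pthread_create().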

Finally, consider the static data member foo::default_value. There is only one of these in the whole program. If there is any chance that multiple threads might call set_default_value() at the same time, or that one of them might change it while some other thread is reading it, then I would need to protect foo::default_value with a mutex or some other mechanism.
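
As a rough sketch of that protection, assuming a compiler that provides std::mutex (a later addition to the standard library; a platform mutex would serve the same purpose): the names default_lock, get_default_value(), value() and demo_default() are hypothetical.

```cpp
#include <mutex>

class foo
{
  int my_value;
  // Shared by every foo, so every access goes through the lock.
  static int default_value;
  static std::mutex default_lock;          // hypothetical name
public:
  foo() { my_value = get_default_value(); }
  int value() const { return my_value; }   // hypothetical accessor

  static void set_default_value( int new_value )
  {
    std::lock_guard<std::mutex> guard( default_lock );
    default_value = new_value;
  }
  static int get_default_value()
  {
    std::lock_guard<std::mutex> guard( default_lock );
    return default_value;
  }
};

int foo::default_value = 10;
std::mutex foo::default_lock;

int demo_default()
{
  foo::set_default_value( 20 );
  foo later;               // constructed after the change
  return later.value();
}
```

Readers take the lock as well as writers, because a read that overlaps a write is just as much a race as two overlapping writes.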

I've another question. It seems that both change_value() and change_some_foo() can work well. change_value() gets an object pointer added and passed over to it implicitly, and change_some_foo() gets an object pointer explicitly. The question arises: which one should be used in the class design?

You're right - both functions do exactly the same thing. I did wonder (1) whether it was confusing to send them to you that way, and (2) whether it would have been better design for the static function to call the member function on its parameter instead of tinkering with the data directly. Yes, it would, of course, but I wanted to show that the static member function did have access to the private data of an instance.

In general, I think the best design would be to prefer instance member functions over static functions whenever possible; that fits better with the OO paradigm of encapsulating data and behaviour together.

Static functions are best used to bundle some kind of functionality into the class which is logically related to the class's purpose, but not part of the behaviour of any instance. An example of this would be the set_default_value() function - this is a value that every instance may have occasion to need, but which doesn't belong to any particular instance. Now that I think about it more, here is a better implementation than using a static member of the class to hold the default value - make it a static variable within the function itself:

class foo
{
public:
  static int default_value( int new_value =
            SOMETHING_INVALID );
  // other stuff as before
};

int foo::default_value( int new_value )
{
  static bool initialized = false;
  static int value;

  if ( new_value != SOMETHING_INVALID )
  {
    // caller supplied a new default
    value = new_value;
    initialized = true;
  }
  else if ( !initialized )
  {
    // first request: set up the default
    value = 10;
    initialized = true;
  }
  return value;
}

When you need the default value to initialise a foo object, you would get it with the simple call foo::default_value() - the first time you do this, if the static variable hadn't previously been initialised, it would be done at that time. Or you could set it to a different value with foo::default_value( 20 ). This of course assumes that there is some invalid value, here SOMETHING_INVALID, that could be used to signal that you just want the existing default.

What are the advantages of this approach?

First, initialisation only takes place if you actually need the default value - if you never ask for it, there's no overhead. If the initialisation were expensive (maybe going over the network to get the value?), doing it routinely at the start of every program would slow things down.

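Second, and this is really serious, when multiple object files are linked together to form a program, the order in which static variables in different files are initialised is unspecified. So a static data member like the one in the previous message:

/*static*/ int foo::default_value = 10;

might not be properly initialised when you needed to access it from another module during start-up, if its initial value had to be computed at run time (a constant such as 10 is safe, because constants are in place before any code runs). This phenomenon is entirely at the whim of the linker, so it may or may not bite you. A function-static variable is set up when the function is first called, so it is always ready when needed.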
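
The sketch below imitates the problem inside a single file, where the order is at least predictable: objects are initialised in the order they are defined. All the names are hypothetical, and compute_default() stands in for any initial value that has to be worked out at run time.

```cpp
// Sketch: a function-static is ready on first use, whatever the
// initialisation order. All names here are hypothetical.
int compute_default() { return 10; }   // stands in for run-time work

// Construct-on-first-use: the static is initialised the first
// time the function is called.
int & lazy_default()
{
  static int value = compute_default();
  return value;
}

extern int plain_default;   // defined further down

struct early_user
{
  int seen_plain;
  int seen_lazy;
  early_user()
    : seen_plain( plain_default ),   // may run before plain_default is set
      seen_lazy( lazy_default() )    // always initialised by this point
  {}
};

early_user user;                         // initialised first
int plain_default = compute_default();   // initialised afterwards
```

Here user.seen_lazy is reliably 10. What user.seen_plain holds depends on the compiler: it may still be zero, or the compiler may have worked out the value in advance. Across separate object files even the order itself is not something you can rely on.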

Third, access to the function-static variable is available only by calling that function, so any locking can be done in that one place. By contrast, any member function of the class can touch a static data member, so every function that does so has to co-operate with the locking mechanism.

To get back to your original question, is there any time when you might want to have a member function and a static function with basically the same functionality? Occasionally - and here's one example:

class enhanced_string
{
public:
  void uppercase();
  // uppercase my instance data
  static enhanced_string uppercase
    (enhanced_string  const & s)
  {
    enhanced_string temp = s;
    temp.uppercase();
    return temp;
  }
  // other stuff
};

So, if you wanted any individual enhanced_string to change itself to the uppercase representation, you would call its member function. To acquire a new enhanced_string, which was an uppercased copy of an existing string, you would call the static function and pass it the string you wanted to copy. The advantage is that you don't have to 'fatten' the interface by coming up with two different names for member functions to perform the different functionality.
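
A filled-in version of the class might look like the sketch below. The std::string member, the constructor and the accessor str() are assumptions made so that the example is complete; only the two uppercase() functions come from the listing above.

```cpp
#include <cctype>
#include <string>

class enhanced_string
{
  std::string text;   // hypothetical representation
public:
  enhanced_string( const char * s ) : text( s ) {}

  // uppercase my instance data
  void uppercase()
  {
    for ( std::string::size_type i = 0; i < text.size(); ++i )
      text[i] = static_cast<char>(
        std::toupper( static_cast<unsigned char>( text[i] ) ) );
  }

  // return an uppercased copy, leaving s untouched
  static enhanced_string uppercase( enhanced_string const & s )
  {
    enhanced_string temp = s;
    temp.uppercase();
    return temp;
  }

  // hypothetical accessor, for inspecting the result
  std::string const & str() const { return text; }
};
```

With this in place, name.uppercase() changes name itself, while enhanced_string::uppercase( name ) returns a modified copy. The two can share a name because their parameter lists differ.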

In summary, prefer member functions to static functions if they make any sense at all.
