Journal Articles

CVu Journal Vol 11, #1 - Nov 1998

Browse in :

All > Journals > CVu > 111 (19)

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Getting Started

Author: Administrator

Date: 03 November 1998 13:15:28 +00:00 or Tue, 03 November 1998 13:15:28 +00:00

Summary:

Body:

One of the problems that newcomers to programming have is in sorting out their thoughts about a problem. Often the problem is an exercise from a book or a tutor. These problems have a habit of being very poorly specified. Initial specifications are often (usually) deficient even in professional environments. Programmers need to learn to refine initial specifications. A way of doing this is by having a dialogue (in a professional environment, with your client).

The problem that faces the newcomer is that it seems hard to hold a dialogue with a book. However learning to interrogate the question is a vital problem solving skill. I offer you the following as an example.

Initial Programming Exercise:

Write a program that will count the occurrences of characters in a string and print out the most frequently used and the least frequently used (that has been used at least once). Also print out the frequency for these items.

The following dialogue is between the student and a mentor (that mentor might in fact be the students alter ego, indeed for long term progress it needs to be). I have distinguished between student and mentor by the use of different typefaces.

Student

Where do I start?

Mentor

The beginning would seem a good place. Think about what constitutes a string.

Student

Letters? Numbers? Punctuation?

Mentor

Yes. What are all those things called collectively?

Student

Symbols?

Mentor

But what represents a symbol in C?

Student

A char.

Mentor

Good, but there are several kinds of char. Which should we use?

Student

I know that there are signed and unsigned ones but how do I choose?

Mentor

Do you think the concept of a negative character makes sense?

Student

No. So I guess I should choose the unsigned one.

Mentor

How many different characters might there be in the string?

Student

I have no idea.

Mentor

Yes you have. Think about unsigned char. How many different symbols can it represent?

Student

It all depends

Mentor

On what?

Student

Which codes make sense.

Mentor

OK. Let us assume they all do. What is the largest number? Think about limits.

Student

You mean check limits.h? … UCHAR_MAX.

Mentor

Almost, but that is the value of the largest. Think about the value of the smallest.

Student

I see. We are counting from zero. So at most there may be UCHAR_MAX+1 different unsigned chars.

Mentor

Good. So we will need to be able to keep counts for up to that number of characters. Does that give you any ideas?

Student

What about an array of counters? Something like int counters[UCHAR_MAX+1].

Mentor

Why an int array? Think about the nature of a counter. What is the least value it can have?

Student

You mean it would be better as:

unsigned int counters[UCHAR_MAX + 1];

Mentor

Yes. But where are we going to put that definition and how should it be initialised?

Student

What is wrong with it as it is?

Mentor

Nothing until someone else uses your code, or until you recycle it yourself in a few months time. As it is it is a global variable. Despite all you may have been taught, global variables are very bad news in modern code where several processes (threads) may be running at the same time.

Student

OK. Then it will have to be in a function. What about:

int main(void){
  unsigned int counters[UCHAR_MAX + 1];
}

Mentor

Well that is a start. But you need to initialise your counters. Check out the rules and you will find that you can do better than write a loop.

Student

Yes, I remember that we can brace initialise an array to zeroes.

Mentor

Good. Now where is that string coming from?

Student

I do not know. But I could delay the decision with a function. Something like:

char * getString();

and use that for a data item in main()

char * data_string = getString();

Mentor

Excellent. Make a note that we will have to write getString() and make sure that it returns a pointer to a non-local object.

For the next part we will have to tie down the problem a little further. What sort of characters are we going to count? All of them or just letters?

Student

I think that just letters was intended.

Mentor

OK. But are we to distinguish upper and lower case?

Student

Let us do it the hard way and say no.

Mentor

Now we need to do some research. How do we identify letters? Be careful because we do not want to miss the accented ones. And how do we collect corresponding upper and lower case ones together. Go and check out the Standard Library.

Student

…OK, I think I have found what we need. isalpha() will identify letters and tolower() will provide the code for lowercase versions of uppercase letters.

Mentor

Time to write a little more code. Write code to count the letters from your string:

Student

Like this:

int main(void){
  unsigned int counters[UCHAR_MAX + 1] = {0};
  char * data_string = getString();
  unsigned int iter;
  for(  iter = 0; 
      iter <  strlen(data_string);
       ++iter){
    if (isalpha(data_string[iter]) {
      ++counters[tolower(data_string)];
    }
  }
  return 0;
}

Mentor

Good. You will need some header files, you will need to provide a definition of getString(). I like that simple use of strlen() rather than trying to be clever. I also note that someone has had the sense to tell you to pre-increment for preference (that will work better if you write C++ code)

It is time to consider finding the least often used letter. Think about what the largest value of this could be under the worst circumstances (always a good idea to think about boundary conditions).

Student

The string might contain a single letter repeated. So the largest possible value for the count of the least used letter (that is a mouthful☺) is strlen(data_string);

Mentor

Exactly. Now just iterate through the counters lowering the limit each time you find a lower but non-zero count.

Student

Like:

unsigned int least=strlen(data_string);
for(iter=0; iter<UCHAR_MAX+1; ++iter){
  if(counters[iter])
    if(counters[iter]<least) {
      least = counters[iter];
    }
  }
}

And I could do the opposite for finding the most frequent.

unsigned int most = 0;
for(iter=0; iter<UCHAR_MAX+1; ++iter){
  if(counters[iter])
    if(counters[iter]<most){
       most = counters[iter];
    }
  }
}

Mentor

True. What would it mean if least stays as strlen(data_string)?

Student

That the string was all one letter.

Mentor

Really? Any other possibility? What if if(counters[iter]) always fails?

Student

I see. If there were no letters in the string then the value of least would never be changed. But neither would the value of most. Hey that is useful. The only way that the value of most could be zero is if there were no letters in the string. Perhaps we should count those first.

Mentor

Fine. Now see if you can go away and write your complete program. It never said in the original spec but I think you should print out all letters that are least frequent and all that are most frequent.

The Final Cut

#include <limits.h>
#include <stdio.h>
#include <ctype.h>
#include <string.h>
char const * getString();
 int main(void){
  unsigned int counters[UCHAR_MAX + 1] = {0};
  char * data_string = getString();
  unsigned int iter;
  for(iter = 0; iter <  strlen(data_string); ++iter){
    if (isalpha(data_string[iter]) {
      ++counters[tolower(data_string)];
    }
  }
  {  /* block to allow timely definitions of locals */
    unsigned int most = 0;
    unsigned int least = strlen(data_string);
    for(iter=0; iter<UCHAR_MAX+1; ++iter){
      if(counters[iter])
        if(counters[iter]<most) most 
                    = counters[iter];
      }
    }
    if(most == 0){
      puts("There are no letters in the string");
      return 0;      /* note early return */
    }
    for(iter=0; iter<UCHAR_MAX+1; ++iter){
      if(counters[iter])
        if(counters[iter]<least) least  
                    = counters[iter];
    }
    puts("The least used letters are:");
    for(iter=0; iter<UCHAR_MAX+1; ++iter){
      if(counters[iter] == least){
        printf("%c, ", iter);
      }
    }
    printf("\nEach used %d times.\n", least);
    puts("The most used letters are:");
    for(iter=0; iter<UCHAR_MAX+1; ++iter){
      if(counters[iter] == most){
        printf("%c, ", iter);
      }
    }
    printf("\nEach used %d times.\n", most);
  }
  return 0;
}

Of course there is still room for polishing. Ideally I would like to see code extracted into functions where it can be reused and otherwise maintained. But my purpose with this article was to try to help those struggling to see how writing a program is a stepwise process. Trying to tackle it all at once is far to daunting. As you gain experience you work in larger steps. Some of the steps above are large for the true novice but I have to give some consideration to space and the boredom factor.

Remember that the biggest difference between any novice and a Chess Grandmaster is the complexity of each step in their analysis of a position.

Notes:

More fields may be available via dynamicdata ..