LAXO
LAXO

Reputation: 59

Optimizing for() loop in C

my teacher said i could optimize this for() loop, but i can't see how, any help would be great

void vowels(char strng[])
{
int i, j, word_len, vowels_len, vowel_count;
char vowels[] =  "aeiouAEIOU";
word_len = strlen(strng);     
vowels_len = strlen(vowels);
vowel_count = 0;

for (i = 0; i < word_len; ++i) {         
    for (j = 0; j < vowels_len; ++j) {   
       if (strng[i] == vowels[j]) {
           ++vowel_count;
           break;
       }
    }
}

printf("%s: %d vowels\n", strng, vowel_count);    
}

Upvotes: 1

Views: 123

Answers (5)

user3811082
user3811082

Reputation: 251

Fastest thing ever.

void ultravowulator(const char strng[])
{
    char vowels[] = "aeiouAEIOU";
    int vowelscnt = strlen(vowels);    
    int vocabular[256] = {0};   

    for(; *strng != '\0'; strng++) {    
        vocabular[*strng]++;
    }

    int total = 0;
    for (int i = 0; i < vowelscnt; i++){        
        total += vocabular[vowels[i]];
    }
    cout << total << endl;
}

main()
{
    char word[] = "Stack overflow";
    ultravowulator(word);
}

Upvotes: 0

Lundin
Lundin

Reputation: 215115

In order to optimize code, you must first realize what's the bottlenecks. On the algorithm-level, you have the following basic performance problems:

  • You must realize that strlen goes through the whole string in search of a null terminator and therefore you traverse the same string strng twice. This is inefficient - the optimal would be to check for null termination at the same time as you check the data.
  • The literal "aeiouAEIOU" is a constant so you know the size of it at compile-time. No need to calculate this in run-time with strlen(). You could use sizeof instead, which is evaluated at compile-time.
  • By checking for both upper case and lower case, you double the work. Instead, temporarily convert the character you look for to upper case, then only look among upper case vowels for a match.

You also have a major bug, namely that the function doesn't return the result.

The first step of "naive" manual optimization would therefore be something like this:

#include <stdio.h>
#include <ctype.h>
#include <stdbool.h>

int vowels(const char str[])
{
  const char VOWEL_LOOKUP[] = "AEIOU";
  int vowel_count = 0;

  for(; *str != '\0'; str++)
  {
    char ch = toupper(*str);
    for(size_t i=0; i<sizeof(VOWEL_LOOKUP)-1; i++)
    {
      if(ch == VOWEL_LOOKUP[i])
      {
        vowel_count++;
        break;
      }
    }
  }

  return vowel_count;
}


int main (void)
{
  const char str[] = "Stack Overflow"; // 4 vowels 
  printf("%s: %d vowels\n", str, vowels(str));
}

On an advanced level, you would look at the number of branches needed for the algorithm, as these tend to kill performance the most, blocking the CPU's branch prediction.

You could then consider replacing the inner loop with a look-up table. This may or may not improve performance - but at least it will make performance deterministic. And you'll probably start to sacrifice readability at this point, so you shouldn't optimize further unless this is a known bottleneck.

A look-up table version might look something like this evil, not-recommended code, which exploits the fact that symbol tables usually list letters in alphabetic order (EBCDIC did not, so this won't compile on various wildly retarded legacy systems).

int vowels(const char str[])
{
  _Static_assert('Z' - 'A' == 25, "Weird symbol table."); // compile-time sanity check 
  const bool VOWEL_LOOKUP['U'-'A'+1] = // compromise between sacrificing RAM or speed
  {
    ['A'-'A'] = true,
    ['E'-'A'] = true,
    ['I'-'A'] = true,
    ['O'-'A'] = true,
    ['U'-'A'] = true,
  };
  int vowel_count = 0;

  for(; *str != '\0'; str++)
  {
    char ch = toupper(*str);
    if(ch >= 'A' && ch <= 'U') // always 2 branches instead of 5
    { // but comes with a bit of calculation overhead:
      ch -= 'A';
      vowel_count += (int)VOWEL_LOOKUP[(size_t)ch]; 
    }
  }

  return vowel_count;
}

This is not necessarily faster... always benchmark it.

Upvotes: 0

Paul Ogilvie
Paul Ogilvie

Reputation: 25286

An optimization more simplistic than Dasblinkenlight's is:

char vowels[] =  "AEIOU";
vowels_len = sizeof(vowels)-1;
    ...
    for (char c=strng[i]&~32, j = 0; j < vowels_len; ++j) {
        if (c == vowels[j]) {

(c&~32 is toupper in ascii.)

Upvotes: 0

Sergey Kalinichenko
Sergey Kalinichenko

Reputation: 727047

One approach would be to remove the nested loop altogether, replacing it with an array that maps character codes to a flag indicating whether or not the char is a vowel:

int isVowel[256] = {0};
for (int i = 0 ; vowels[i] != '\0' ; i++) {
    isVowel[(unsigned char)vowels[i]] = 1;
}

Now the main loop can be optimized as follows:

for (int i = 0; i != word_len ; i++) {         
    vowel_count += isVowel[(unsigned char)string[i]];
}

You can achieve a comparable performance with a switch statement:

for (int i = 0; i != word_len ; i++) {
    switch(string[i]) {
        case 'a':
        case 'e':
        case 'i':
        case 'o':
        case 'u':
        case 'A':
        case 'E':
        case 'I':
        case 'O':
        case 'U':
            vowel_count ++;
            break;
    }
}

Upvotes: 5

Chris Turner
Chris Turner

Reputation: 8142

Not sure if your teacher would consider this cheating, but this is one possible way to hunt for vowels that would be more optimal than using 2 nested for loops.

char *pos;
int vowel_count=0;

pos=strng;
while(pos)
    {
    pos=strpbrk(pos,"aeiouAEIOU");
    if(pos)
        {  
        vowel_count++;
        pos++;
        }
    }

Upvotes: 0

Related Questions