hipyhop

Reputation: 179

Duplicate character skipping in C++ string processing

I'm writing a high-performance function that needs to process a string (char *).

These strings are often very long but contain many duplicate characters, and a character has no further effect once it has been processed.

I've used a std::set to store the processed characters, and I check that a character is not in the set before processing it.

Is there a more efficient method you can think of?

Thanks

SOLUTION:

I went for a bool array.

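bool b[256] = {false};  // one flag per possible byte value
...
unsigned char c = static_cast<unsigned char>(*ci);  // plain char may be signed; avoid a negative index
if(!b[c]){
  b[c]=true;
  ...
}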

Thanks for the help!

Upvotes: 1

Views: 351

Answers (3)

João Augusto

Reputation: 2305

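#include <cstring>  // std::memset

unsigned char cCheck[256];  // one flag per possible byte value

void Process(const char* p_cInput)
{
    std::memset(cCheck, 0, sizeof(cCheck));
    while(*p_cInput != '\0')
    {
        // Index as unsigned char: plain char may be signed, which would give a negative index.
        const unsigned char c = static_cast<unsigned char>(*p_cInput);
        if(cCheck[c] == 0)
            cCheck[c] = 1;  // first occurrence: mark as seen
        else
        {
            // Already seen: this version stops at the first repeated character
            break;
        }

        p_cInput ++;
    }
}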

Upvotes: 3

Matt

Reputation: 7160

You need a 256-bit (32-byte) bitmap that is initialised to 0, and then you set the corresponding bit as you see each character. The easiest way to make that data type is to split it into four 8-byte (64-bit) integers, and then use the value of the character to work out which integer to check or write to.
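A minimal sketch of that bitmap idea, assuming 8-bit chars. The names CharBitmap, test_and_set and unique_chars_bitmap are made up for illustration, and appending to a std::string stands in for whatever the real per-character processing is.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper type: a 256-bit set stored as four 64-bit words.
struct CharBitmap {
    std::uint64_t words[4] = {0, 0, 0, 0};

    // Marks c as seen and returns whether it had already been seen before.
    bool test_and_set(char c) {
        // Convert to unsigned char first: plain char may be signed.
        const unsigned char u = static_cast<unsigned char>(c);
        std::uint64_t& word = words[u >> 6];                      // which of the 4 words
        const std::uint64_t mask = std::uint64_t{1} << (u & 63);  // which bit in that word
        const bool seen = (word & mask) != 0;
        word |= mask;
        return seen;
    }
};

// Hypothetical example: returns each distinct character once, in first-seen order.
std::string unique_chars_bitmap(const char* s) {
    CharBitmap seen;
    std::string out;
    for (; *s != '\0'; ++s) {
        if (!seen.test_and_set(*s)) {
            out.push_back(*s);  // stand-in for the real processing
        }
    }
    return out;
}
```

The whole set fits in 32 bytes, at the cost of a shift and a mask per lookup. Whether that beats a plain 256-byte array is something to measure rather than assume.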

Upvotes: 1

Mario The Spoon

Reputation: 4859

Just use an array with one entry per possible character value (256 for an 8-bit char) and tick off each character in the array as you process it.
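As a rough sketch of what that could look like, assuming 8-bit chars: the function name unique_chars_table is hypothetical, and collecting the characters into a std::string is only a placeholder for the real processing.

```cpp
#include <string>

// Hypothetical example: returns each distinct character of s once, in first-seen order.
std::string unique_chars_table(const char* s) {
    bool seen[256] = {false};  // one flag per possible 8-bit character value
    std::string out;
    for (; *s != '\0'; ++s) {
        // Index with unsigned char so a signed char never yields a negative index.
        const unsigned char u = static_cast<unsigned char>(*s);
        if (seen[u]) {
            continue;  // duplicate: skip it and keep scanning
        }
        seen[u] = true;     // tick the character off
        out.push_back(*s);  // stand-in for the real processing
    }
    return out;
}
```

Each lookup is a single array index, so the check is constant time per character, whereas std::set needs a logarithmic tree search and allocates a node on each insert.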

Upvotes: 5
