Guilherme Lima

Reputation: 55

How can I get the right length of the string?

Why does my function count more characters than expected?

int countLength(char* buffer){
    int cnt = 0;
    for (int i=0; buffer[i] != '\n' && buffer[i] != '\0'; i++){
        cnt++;
    }
    return cnt;
}

For example, if I pass it "Será chuva? Será gente?" as input, it returns 25 instead of 23. Why is that?

Upvotes: 1

Views: 76

Answers (1)

Deduplicator

Reputation: 45694

The code gives you the right answer, even if it is not the answer you expect.

The problem is that you expect it to count graphemes (like á), while it counts bytes, i.e. code units. In UTF-8, a precomposed á (Unicode Normalization Form C) takes two code units, so the two á characters in your string account for the two extra counts.

A first approximation would be to count code points instead, by skipping continuation bytes (those greater than 0x7f and less than 0xc0). To actually count graphemes, you would have to use a proper Unicode library that carries all the character information, such as International Components for Unicode (ICU), and accept its decisions.
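The code-point approximation could be sketched roughly as follows. The name countCodePoints is made up for illustration, and the sketch assumes the buffer holds valid UTF-8; it does not validate the input.

```c
#include <stddef.h>

/* Hypothetical helper: counts UTF-8 code points up to the first
   '\n' or '\0'. Assumes the buffer contains valid UTF-8. */
size_t countCodePoints(const char *buffer)
{
    size_t count = 0;
    for (size_t i = 0; buffer[i] != '\n' && buffer[i] != '\0'; i++) {
        /* Continuation bytes look like 10xxxxxx (0x80 to 0xBF).
           Every other byte starts a new code point. */
        if (((unsigned char)buffer[i] & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

With the string from the question encoded as UTF-8, this should return 23, since each á contributes one lead byte and one continuation byte. It is still only an approximation: an á written as a plain a followed by a combining acute accent would count as two code points.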

Read up on character sets, especially Unicode and the UTF-8 encoding.

As an aside, cnt always mirrors i. While an optimizing compiler will remove this duplication, it shouldn't even be there.
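For illustration, the byte-counting loop without the redundant counter might look like this. The name countBytes is an assumption, not something from the question, and size_t is swapped in for int as the usual type for lengths.

```c
#include <stddef.h>

/* Hypothetical rewrite of the question's function: the loop index
   itself is the byte count, so no second counter is needed. */
size_t countBytes(const char *buffer)
{
    size_t i = 0;
    while (buffer[i] != '\n' && buffer[i] != '\0') {
        i++;
    }
    return i;
}
```

This still counts bytes, so it returns 25 for the UTF-8 string in the question. The standard call strcspn(buffer, "\n") from <string.h> computes the same value.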

Upvotes: 2
