Daniel Näslund
Daniel Näslund

Reputation: 2410

Idiom for handling size_t underflow in loop condition

In C and C++, size_t is an unsigned type that is used for expressing size. It expresses intent and somewhat simplifies range assertions (len < upper_bound vs len >= 0 && len < upper_bound for signed integers).

(In all the examples below len means the length of the array a).

The idiom for a for loop is: for (i = 0; i < len; i++). The idiom for a backward for loop is for (i = len-1; i >= 0; i--). But having unsigned loop indices introduces subtle bugs, and every so often I mess up the edge cases.

First, the backwards for loop. This code underflows for len=0.

for (size_t i = len-1; i >= 0; i--) { // Bad: Underflows for len=0
    use(a[i]);
}

There's the --> "operator" trick, which looks strange if you're not used to it.

for (size_t i = len; i--> 0;) {
    use(a[i]);
}

You can use a signed type for the loop index variable, but that overflows if len > INT_MAX. Many people and organizations considers that risk so minimal that they just stick to int.

for (int i = len-1; i >= 0; i--) {  // BAD: overflows for len < INT_MAX
    use(a[i]);
}

So I've settled for this construct, since it's closest to the canonical for-loop form and has the simplest expressions.

for (size_t i = len; i > 0; i--) {
    size_t pos = i-1;
    use(a[pos]);
}

My problem iterating from 0 to len-1

That is, looping over the range [0, len-1). This loop underflows when len=0.

for (size_t i = 0; i < len-1; i++) {   // BAD: Underflows for len=0.
    use(a[i]);
}

As for the iterating backwards case, you can use signed integers but that may cause overflows.

for (int i = 0; i < len-1; i++) {    // BAD: Will overflow if len > INT_MAX
    use(a[i]);
}

I tend to add another expression to the loop condition, checking for len > 0, but that feels clumsy.

for (size_t i = 0; len > 0 && i < len-1; i++) {
    use(a[i]);
}

I can add an if statement before the loop, but that also feels clumsy.

Is there a less bulky way of writing a for loop with unsigned index variables looping from 0 to len-1?

Upvotes: 3

Views: 1233

Answers (5)

Peter
Peter

Reputation: 36607

There are many possible answers for this question. Which is best is opinion based. I'll offer some options.

For iterating over all elements of an array in reverse order, using an unsigned index, one option is

for (size_t index = 0; index < len; ++index)
{
     size_t i = len - 1 - index;
     use(a[i]);
}

or (more simply)

for (size_t i = 0; i < len; ++i)
{
     use(a[len - 1 - i]);
}

In both cases, if len is zero, the loop body is not executed. Although the loops increment rather than decrement, both access elements in reverse order. If you are frequently writing such loops, it is also not difficult to write a little inline function of the form

size_t index_in_reverse(size_t index, size_t len)
{
     return len - 1 - index;
}

and do

for (size_t i = 0; i < len; ++i)
{
     use(index_in_reverse(i, len));
}

To iterate forward over all elements of the loop in forward order, except the last, I'd make that explicit rather than trying to do it in the loop condition.

if (len > 0)
{
     size_t  shortened_len = len - 1;
     for (size_t i = 0; i < shortened_len; ++i)
         use(a[i]);
}

The reason I introduce the variable shortened_len is to make the code self-documenting about the fact it is not iterating over the entire array. I've seen too many cases where a condition of the form i < len - 1 is "corrected" by a subsequent developer to remove the - 1 because they believe it is a typo.

That may feel "clumsy" to the OP, but I suggest that

for (size_t i = 0; len > 0 && i < len-1; i++) {
   use(a[i]);
} 

is harder for a human to understand (putting multiple tests in a loop condition, forces a person maintaining the code to actually work out what both conditions do and, how they interact). Given a choice between code that is "concise" or code that is "less concise but consumes less brainpower of an unacquainted human to understand" I will ALWAYS choose the latter. Bear in mind that the user maintaining the code six months later may be yourself, and there are few thought processes more humbling than "What idiot wrote this?? Oh, it was me!".

Upvotes: 0

0___________
0___________

Reputation: 67602

I tend to add another expression to the loop condition, checking for len > 0, but that feels clumsy.

Your code should follow the logic. It is not clumsy at all. And it is much more readable for humans.

As the loop makes no sense if the len == 0 and I usually use the if statement. It makes the code easy to understand and maintain.

if(len)
{
    for (size_t i = 0; i < len-1; i++) { /*...*/ }       
}

or you can also add the check in the loop. BTW you only check if it is not zero.

 for (size_t i = 0; len && i < len-1; i++) { /*...*/ }       

Upvotes: 0

Ulrich Eckhardt
Ulrich Eckhardt

Reputation: 17415

In all these cases, you have one common theme: You have an index that has a start value. It gets incremented or decremented until it reaches the end value, which terminates the loop.

One thing that helps here is to make these two values explicit. In the simplest case, forward-iterating the whole range, the end value is simply len:

for (size_t i=0, end=len; i!=end; ++i) {
    ///...
}

This follows the general advise given to people using iterators. In particular, pay attention to the comparison, which works here and which is actually required for some iterators.

Now, backward iterating:

for (size_t i=len-1, end=-1; i!=end; --i) {
    ///...
}

Lastly, iterating a subset excluding the last n elements of the range backwards:

if (len > n) {
    for (size_t i=len-n-1, end=-1; i!=end; --i) {
        ///...
    }
}

Actually, what you were fighting with was your attempt to put too much stuff into the loop logic. Just be explicit that this requires more than n elements in order to do anything at all. Yes, you could put len > n into the loop condition, but that wouldn't give you clear and simple code who's intention anyone understands.

Upvotes: 1

Tarek Dakhran
Tarek Dakhran

Reputation: 2161

There are two cases here.

Iterating forward from 0 to len - 2 inclusive

for (size_t i = 0; i + 1 < len; ++i) {
    size_t index = i;
    // use index here
}

Iterating backward from len - 2 to 0 inclusive

for (size_t i = len; i > 1; --i) {
    size_t index = i - 2;
    // use index here
}

Upvotes: 2

Jan15
Jan15

Reputation: 539

How about

for (size_t i = 0; i+1 < len; i++) {
    use(a[i]);
}

Upvotes: 1

Related Questions