Reputation: 2410
In C and C++, size_t
is an unsigned type that is used for expressing size. It expresses intent and somewhat simplifies range assertions (len < upper_bound
vs len >= 0 && len < upper_bound
for signed integers).
(In all the examples below len
means the length of the array a
).
The idiom for a for loop is: for (i = 0; i < len; i++)
. The idiom for a backward for loop is for (i = len-1; i >= 0; i--)
. But having unsigned loop indices introduces subtle bugs, and every so often I mess up the edge cases.
First, the backwards for loop. This code underflows for len=0.
for (size_t i = len-1; i >= 0; i--) { // Bad: Underflows for len=0
use(a[i]);
}
There's the -->
"operator" trick, which looks strange if you're not used to it.
for (size_t i = len; i--> 0;) {
use(a[i]);
}
You can use a signed type for the loop index variable, but that overflows if len > INT_MAX
. Many people and organizations considers that risk so minimal that they just stick to int
.
for (int i = len-1; i >= 0; i--) { // BAD: overflows for len < INT_MAX
use(a[i]);
}
So I've settled for this construct, since it's closest to the canonical for-loop form and has the simplest expressions.
for (size_t i = len; i > 0; i--) {
size_t pos = i-1;
use(a[pos]);
}
0
to len-1
That is, looping over the range [0, len-1). This loop underflows when len=0
.
for (size_t i = 0; i < len-1; i++) { // BAD: Underflows for len=0.
use(a[i]);
}
As for the iterating backwards case, you can use signed integers but that may cause overflows.
for (int i = 0; i < len-1; i++) { // BAD: Will overflow if len > INT_MAX
use(a[i]);
}
I tend to add another expression to the loop condition, checking for len > 0
, but that feels clumsy.
for (size_t i = 0; len > 0 && i < len-1; i++) {
use(a[i]);
}
I can add an if statement before the loop, but that also feels clumsy.
Is there a less bulky way of writing a for loop with unsigned index variables looping from 0 to len-1?
Upvotes: 3
Views: 1233
Reputation: 36607
There are many possible answers for this question. Which is best is opinion based. I'll offer some options.
For iterating over all elements of an array in reverse order, using an unsigned index, one option is
for (size_t index = 0; index < len; ++index)
{
size_t i = len - 1 - index;
use(a[i]);
}
or (more simply)
for (size_t i = 0; i < len; ++i)
{
use(a[len - 1 - i]);
}
In both cases, if len
is zero, the loop body is not executed. Although the loops increment rather than decrement, both access elements in reverse order. If you are frequently writing such loops, it is also not difficult to write a little inline function of the form
size_t index_in_reverse(size_t index, size_t len)
{
return len - 1 - index;
}
and do
for (size_t i = 0; i < len; ++i)
{
use(index_in_reverse(i, len));
}
To iterate forward over all elements of the loop in forward order, except the last, I'd make that explicit rather than trying to do it in the loop condition.
if (len > 0)
{
size_t shortened_len = len - 1;
for (size_t i = 0; i < shortened_len; ++i)
use(a[i]);
}
The reason I introduce the variable shortened_len
is to make the code self-documenting about the fact it is not iterating over the entire array. I've seen too many cases where a condition of the form i < len - 1
is "corrected" by a subsequent developer to remove the - 1
because they believe it is a typo.
That may feel "clumsy" to the OP, but I suggest that
for (size_t i = 0; len > 0 && i < len-1; i++) { use(a[i]); }
is harder for a human to understand (putting multiple tests in a loop condition, forces a person maintaining the code to actually work out what both conditions do and, how they interact). Given a choice between code that is "concise" or code that is "less concise but consumes less brainpower of an unacquainted human to understand" I will ALWAYS choose the latter. Bear in mind that the user maintaining the code six months later may be yourself, and there are few thought processes more humbling than "What idiot wrote this?? Oh, it was me!".
Upvotes: 0
Reputation: 67602
I tend to add another expression to the loop condition, checking for len > 0, but that feels clumsy.
Your code should follow the logic. It is not clumsy at all. And it is much more readable for humans.
As the loop makes no sense if the len == 0
and I usually use the if statement. It makes the code easy to understand and maintain.
if(len)
{
for (size_t i = 0; i < len-1; i++) { /*...*/ }
}
or you can also add the check in the loop. BTW you only check if it is not zero.
for (size_t i = 0; len && i < len-1; i++) { /*...*/ }
Upvotes: 0
Reputation: 17415
In all these cases, you have one common theme: You have an index that has a start value. It gets incremented or decremented until it reaches the end value, which terminates the loop.
One thing that helps here is to make these two values explicit. In the simplest case, forward-iterating the whole range, the end value is simply len
:
for (size_t i=0, end=len; i!=end; ++i) {
///...
}
This follows the general advise given to people using iterators. In particular, pay attention to the comparison, which works here and which is actually required for some iterators.
Now, backward iterating:
for (size_t i=len-1, end=-1; i!=end; --i) {
///...
}
Lastly, iterating a subset excluding the last n
elements of the range backwards:
if (len > n) {
for (size_t i=len-n-1, end=-1; i!=end; --i) {
///...
}
}
Actually, what you were fighting with was your attempt to put too much stuff into the loop logic. Just be explicit that this requires more than n
elements in order to do anything at all. Yes, you could put len > n
into the loop condition, but that wouldn't give you clear and simple code who's intention anyone understands.
Upvotes: 1
Reputation: 2161
There are two cases here.
Iterating forward from 0
to len - 2
inclusive
for (size_t i = 0; i + 1 < len; ++i) {
size_t index = i;
// use index here
}
Iterating backward from len - 2
to 0
inclusive
for (size_t i = len; i > 1; --i) {
size_t index = i - 2;
// use index here
}
Upvotes: 2