This is a rather silly question but why is int
commonly used instead of unsigned int
when defining a for loop for an array in C or C++?
for(int i;i<arraySize;i++){}
for(unsigned int i;i<arraySize;i++){}
I recognize the benefits of using int
when doing something other than array indexing and the benefits of an iterator when using C++ containers. Is it just because it does not matter when looping through an array? Or should I avoid it all together and use a different type such as size_t
?
Using int
is more correct from a logical point of view for indexing an array.
unsigned
semantic in C and C++ doesn't really mean "not negative" but it's more like "bitmask" or "modulo integer".
To understand why unsigned
is not a good type for a "non-negative" number please consider
Obviously none of the above phrases make any sense... but it's how C and C++ unsigned
semantic indeed works.
Actually using an unsigned
type for the size of containers is a design mistake of C++ and unfortunately we're now doomed to use this wrong choice forever (for backward compatibility). You may like the name "unsigned" because it's similar to "non-negative" but the name is irrelevant and what counts is the semantic... and unsigned
is very far from "non-negative".
For this reason when coding most loops on vectors my personally preferred form is:
for (int i=0,n=v.size(); i<n; i++) {
...
}
(of course assuming the size of the vector is not changing during the iteration and that I actually need the index in the body as otherwise the for (auto& x : v)...
is better).
This running away from unsigned
as soon as possible and using plain integers has the advantage of avoiding the traps that are a consequence of unsigned size_t
design mistake. For example consider:
// draw lines connecting the dots
for (size_t i=0; i<pts.size()-1; i++) {
drawLine(pts[i], pts[i+1]);
}
the code above will have problems if the pts
vector is empty because pts.size()-1
is a huge nonsense number in that case. Dealing with expressions where a < b-1
is not the same as a+1 < b
even for commonly used values is like dancing in a minefield.
Historically the justification for having size_t
unsigned is for being able to use the extra bit for the values, e.g. being able to have 65535 elements in arrays instead of just 32767 on 16-bit platforms. In my opinion even at that time the extra cost of this wrong semantic choice was not worth the gain (and if 32767 elements are not enough now then 65535 won't be enough for long anyway).
Unsigned values are great and very useful, but NOT for representing container size or for indexes; for size and index regular signed integers work much better because the semantic is what you would expect.
Unsigned values are the ideal type when you need the modulo arithmetic property or when you want to work at the bit level.