int vs size_t on 64bit

MK. picture MK. · Mar 25, 2010 · Viewed 19.1k times · Source

Porting code from 32bit to 64bit. Lots of places with

int len = strlen(pstr);

These all generate warnings now because strlen() returns size_t which is 64bit and int is still 32bit. So I've been replacing them with

size_t len = strlen(pstr);

But I just realized that this is not safe, as size_t is unsigned and it can be treated as signed by the code (I actually ran into one case where it caused a problem, thank you, unit tests!).

Blindly casting strlen return to (int) feels dirty. Or maybe it shouldn't?
So the question is: is there an elegant solution for this? I probably have a thousand lines of code like that in the codebase; I can't manually check each one of them and the test coverage is currently somewhere between 0.01 and 0.001%.

Answer

mloskot picture mloskot · May 28, 2010

Some time ago I posted a short note about this kind of issues on my blog and the short answer is:

Always use proper C++ integer types

Long answer: When programming in C++, it’s a good idea to use proper integer types relevant to particular context. A little bit of strictness always pays back. It’s not uncommon to see a tendency to ignore the integral types defined as specific to standard containers, namely size_type. It’s available for number of standard container like std::string or std::vector. Such ignorance may get its revenge easily.

Below is a simple example of incorrectly used type to catch result of std::string::find function. I’m quite sure that many would expect there is nothing wrong with the unsigned int here. But, actually this is just a bug. I run Linux on 64-bit architecture and when I compile this program as is, it works as expected. However, when I replace the string in line 1 with abc, it still works but not as expected :-)

#include <iostream>
#include <string>
using namespace std;
int main()
{
  string s = "a:b:c"; // "abc" [1]
  char delim = ':';
  unsigned int pos = s.find(delim);
  if(string::npos != pos)
  {
    cout << delim << " found in " << s << endl;
  }
}

Fix is very simply. Just replace unsigned int with std::string::size_type. The problem could be avoided if somebody who wrote this program took care of use of correct type. Not to mention that the program would be portable straight away.

I’ve seen this kind of issues quite many times, especially in code written by former C programmers who do not like to wear the muzzle of strictness the C++ types system enforces and requires. The example above is a trivial one, but I believe it presents the root of the problem well.

I recommend brilliant article 64-bit development written by Andrey Karpov where you can find a lot more on the subject.