C++: Vector bounds

lekroif picture lekroif · Dec 24, 2012 · Viewed 12k times · Source

I am coming from Java and learning C++ in the moment. I am using Stroustrup's Progamming Principles and Practice of Using C++. I am working with vectors now. On page 117 he says that accessing a non-existant element of a vector will cause a runtime error (same in Java, index out of bounds). I am using the MinGW compiler and when I compile and run this code:

#include <iostream>
#include <cstdio>
#include <vector>

int main() 
{ 
    std::vector<int> v(6);
    v[8] = 10;
    std::cout << v[8];
    return 0;
}

It gives me as output 10. Even more interesting is that if I do not modify the non-existent vector element (I just print it expecting a runtime error or at least a default value) it prints some large integers. So... is Stroustrup wrong, or does GCC have some strange ways of compiling C++?

Answer

Kerrek SB picture Kerrek SB · Dec 24, 2012

The book is a bit vague. It's not as much a "runtime error" as it is undefined behaviour which manifests at runtime. This means that anything could happen. But the error is strictly with you, not with the program execution, and it is in fact impossible and non sensible to even talk about the execution of a program with undefined behaviour.

There is nothing in C++ that protects you against programming errors, quite unlike in Java.


As @sftrabbit says, std::vector has an alternative interface, .at(), which always gives a correct program (though it may throw exceptions), and consequently one which one can reason about.


Let me repeat the point with an example, because I believe this is an important fundamental aspect of C++. Suppose we're reading an integer from the user:

int read_int()
{
    std::cout << "Please enter a number: ";
    int n;
    return (std::cin >> n) ? n : 18;
}

Now consider the following three programs:

The dangerous one: The correctness of this program depends on the user input! It is not necessarily incorrect, but it is unsafe (to the point where I would call it broken).

int main()
{
    int n = read_int();
    int k = read_int();
    std::vector<int> v(n);
    return v[k];
}

Unconditionally correct: No matter what the user enters, we know how this program behaves.

int main() try
{
    int n = read_int();
    int k = read_int();
    std::vector<int> v(n);
    return v.at(k);
}
catch (...)
{
    return 0;
}

The sane one: The above version with .at() is awkward. Better to check and provide feedback. Because we perform dynamic checking, the unchecked vector access is actually guaranteed to be fine.

int main()
{
    int n = read_int();

    if (n <= 0) { std::cout << "Bad container size!\n"; return 0; }

    int k = read_int();

    if (k < 0 || k >= n)  { std::cout << "Bad index!\n"; return 0; }

    std::vector<int> v(n);
    return v[k];
}

(We're ignoring the possibility that the vector construction might throw an exception of its own.)

The moral is that many operations in C++ are unsafe and only conditionally correct, but it is expected of the programmer that you make the necessary checks ahead of time. The language doesn't do it for you, and so you don't pay for it, but you have to remember to do it. The idea is that you need to handle the error conditions anyway, and so rather than enforcing an expensive, non-specific operation at the library or language level, the responsibility is left to the programmer, who is in a better position to integrate the checking into the code that needs to be written anyway.

If I wanted to be facetious, I would contrast this approach to Python, which allows you to write incredibly short and correct programs, without any user-written error handling at all. The flip side is that any attempt to use such a program that deviates only slightly from what the programmer intended leaves you with a non-specific, hard-to-read exception and stack trace and little guidance on what you should have done better. You're not forced to write any error handling, and often no error handling ends up being written. (I can't quite contrast C++ with Java, because while Java is generally safe, I have yet to see a short Java program.)</rantmode>