Large 2D array gives segmentation fault

kar picture kar · May 12, 2009 · Viewed 36.2k times · Source

I am writing some C++ code in Linux where I have declared a few 2D arrays like so:

 double x[5000][500], y[5000][500], z[5000][500];

During compilation there is no error. When I execute it says "segmentation fault".

Wen I reduce the size of the array from 5000 to 50, the program runs fine. How can I protect myself against this problem?

Answer

Thomas L Holaday picture Thomas L Holaday · May 12, 2009

If your program looks like this ...

int main(int, char **) {
   double x[5000][500],y[5000][500],z[5000][500];
   // ...
   return 0;
}

... then you are overflowing the stack. The fastest way to fix this is to add the word static.

int main(int, char **) {
   static double x[5000][500],y[5000][500],z[5000][500];
   // ...
   return 0;
}

The second fastest way to fix this is to move the declaration out of the function:

double x[5000][500],y[5000][500],z[5000][500];
int main(int, char **) {
   // ...
   return 0;
}

The third fastest way to fix this is to allocate the memory on the heap:

int main(int, char **) {
   double **x = new double*[5000];
   double **y = new double*[5000];
   double **z = new double*[5000];
   for (size_t i = 0; i < 5000; i++) {
      x[i] = new double[500];
      y[i] = new double[500];
      z[i] = new double[500];
   }
   // ...
   for (size_t i = 5000; i > 0; ) {
      delete[] z[--i];
      delete[] y[i];
      delete[] x[i];
   }
   delete[] z;
   delete[] y;
   delete[] x;

   return 0;
}

The fourth fastest way is to allocate them on the heap using std::vector. It is fewer lines in your file but more lines in the compilation unit, and you must either think of a meaningful name for your derived vector types or tuck them into an anonymous namespace so they won't pollute the global namespace:

#include <vector>
using std::vector
namespace { 
  struct Y : public vector<double> { Y() : vector<double>(500) {} };
  struct XY : public vector<Y> { XY() : vector<Y>(5000) {} } ;
}
int main(int, char **) {
  XY x, y, z;
  // ...
  return 0;
}

The fifth fastest way is to allocate them on the heap, but use templates so the dimensions are not so remote from the objects:

include <vector>
using namespace std;
namespace {
  template <size_t N>
  struct Y : public vector<double> { Y() : vector<double>(N) {} };
  template <size_t N1, size_t N2>
  struct XY : public vector< Y<N2> > { XY() : vector< Y<N2> > (N1) {} } ;
}
int main(int, char **) {
  XY<5000,500> x, y, z;
  XY<500,50> mini_x, mini_y, mini_z;
  // ...
  return 0;
}

The most performant way is to allocate the two-dimensional arrays as one-dimensional arrays, and then use index arithmetic.

All the above assumes that you have some reason, a good one or a poor one, for wanting to craft your own multidimensional array mechanism. If you have no reason, and expect to use multidimensional arrays again, strongly consider installing a library: