Passing an array as an argument in C++

jkeys picture jkeys · Apr 18, 2009 · Viewed 54.2k times · Source

I'm writing a merge sort function, and right now I am just using a test case array (there is no input - this is static, for now). I don't know how to pass an array as an argument. Here is my code right now:

//merge sort first attempt

#include <iostream>

#include <algorithm>

#include <vector>

int mergeSort(int[]);
int main() {
    int originalarray[] = { 1, 3, 5, 7, 9, 2, 4, 6, 8, 10 };
    mergeSort(originalarray[]);
}

int mergeSort(int[] originalarray) {
    int num = (sizeof(originalarray) / sizeof(int));
    std::vector < int > original(num);

    if (num > 2) {
        return num;
    }

    // Fill the array using the elements of originalarray
    // This is just for demonstration, normally original will be a parameter,
    // so you won't be filling it up with anything.
    std::copy(originalarray, originalarray + num, original.begin());

    // Create farray and sarray of the appropriate size
    std::vector < int > farray(num / 2);
    std::vector < int > sarray(num - farray.size());

    // Fill those using elements from original
    std::copy(original.begin(), original.begin() + farray.size(), farray.begin());
    std::copy(original.begin() + farray.size(), original.end(), sarray.begin());

    mergeSort(farray);
    mergeSort(sarray);
}

Note that this mergeSort function is not functional, as I have not figured out how to merge them yet (that's my assignment). I would like to get my two vectors sorted before I deal with that, and I can't compile this because of my need to pass an array as an argument. I don't understand pointers, so if that is the solution, my excuse is ignorance. I'm learning programming right now, with C++ as a first language, and only have a basic grasp of the language's features. Thanks for the help.

Answer

Charlie Martin picture Charlie Martin · Apr 18, 2009

Jut to extend this a bit, remember that C++ arrays are exactly C arrays. So all you have is the address of a piece of memory that purports (with no guarantees) to be an array of somethings.

Update

Okay, we'll expand a little more.

C (and therefore C++) doesn't really have "arrays" as such. All it has are addresses, pointers. So when you make something an "array", what really happens is you tell the compiler that some variable represents an address.

It's useful to make a distinction in C between a declaration and a definition. In a declaration, you're simply giving something a name and a type; in a definition, you actually allocate space.

So, if we start off by definiing an array like

int ar[100];

that means we're telling the compiler we want space for 100 int's, we want it to all be allocated in one chunk, and we're going to use the name ar for it. The sizeof operator gives the number of bytes used by a type or an object, so our array ar will take up 100×sizeof(int) bytes. On most machines, that will be 400 bytes, but it varies from machine to machine.

If we define a variable

int * ar_p;   // using '_p' as a reminder this is a pointer

we're defining space for a variable that will contain an address. Its size will be sizeof(int*), which will usually be either 4 or 8, but on some machines could be anything from 2 to 16 on some machines you're unlikely to run into soon.

The name of the array is ar. The compiler converts that name into an address, so we can save that address with

ar_p = ar ;     // THIS WORKS

Now, let's say for convenience that our array ar happened to be starting at location 1000 in memory.

That name ar does not have any space allocated to it; it's like a constant, a number. So, you can't reverse that assignment

ar = ar_p ;     // THIS WON'T WORK

for the same reason you couldn't say

1000 = ar_p ;   // THIS WON'T WORK EITHER

ie, you can't change the value of 1000. (Back in early versions of FORTRAN, this trick would work, for complicated reasons. It was a mistake. You've never lived until you've tried to debug a program in which the value of "2" is 3.)

Arrays in C are always zero-based, that is, the first index is always zero. Any other indices are just addresses computed using the index. So, ar[0] is just the address 1000 plus 0 bytes of offset, or 1000. ar[1] is 1000 plus 1 times the size of an int, so the next int over. And in fact, this is always true in C.

This is called an array reference.

When we use the syntax *ar_p we're telling the compiler to get the thing AT the address contained in ar_p. `.

This is called dereferencing a pointer.

If we say

ar_p = ar;

then *ar_p and ar[0] refer to the same thing.

When we say ar[0] we're telling the compiler we want the thing at the address 0 bytes from ar. ar[1] is the address one int, or 4 bytes, from ar. So, *(ar_p+3) refers to the same thing as ar[3]. (We need the parentheses because we want to add 3 to the address first and then look at the contents. *ar_p+3 would get the contents pointed to by ap_p first, and then add 3 to those.

The thing is, C doesn't know, or much care, how big the array really is. If I come along and do ar[365], the compiler will happily generate code to look in the cell 1000+(365×sizeof(int)). If that's in your array, fine, but if it's just random memory, that's fine too. C doesn't care.

(Remember C comes from the phone company. "We don't care; we don't have to. We're the Phone Company.")

So, now, we know some rules, which I've moved down here. Read "≡" as "is equivalent to" or "is the same as".

What you can depend on:

  • foo(TYPE t[])foo(TYPE * t)

Since C doesn't know a difference between pointers and arrays, you can declare either one. When you define a function, you can write

void foo(int[] ar){

or

void foo(int* ar){

and get exactly the same effect.

  • t[i]*(t+i)

This was above. Anywhere you might write ar[i], you can replace it with *(ar+i). (There's actually a weird side case that breaks this, but you won't run into it as a beginner.)

  • where TYPE *t, (t+i) will equal the address at t plus i*sizeof(TYPE)

Explained this above as well. When you index into an array, like ar[42], it means you want the 42nd whatever over from the start address. So, if you're using int, then you need to move over 42 times however wide an int is, which is to say sizeof(int).

Now, that's all C, and since C++ is defined as a "kind of" C, it all holds for C++ as well. EXCEPT

  • unless TYPE is a user defined type that overloads operator[] and operator*.

in C++, you can decide you want to define a new type that acts just like any other type, but you can change the way the language does specific things. So, a programmer can decide to "overload" -- ie, replace -- the default behavior of the array reference and pointer dereference operators with something of their own devising. As a beginner, you shouldn't be confronted with that soon, but you should be aware of it.