The phrase "pass by reference" is used by C and C++ developers alike, but it appears to mean different things to each. What exactly does this equivocal phrase mean in each language?
There are questions that already deal with the difference between passing by reference and passing by value. In essence, passing an argument by value to a function means that the function will have its own copy of the argument - its value is copied. Modifying that copy will not modify the original object. However, when passing by reference, the parameter inside the function refers to the same object that was passed in - any changes inside the function will be seen outside.
Unfortunately, there are two ways in which the phrases "pass by value" and "pass by reference" are used which can cause confusion. I believe this is partly why pointers and references can be difficult for new C++ programmers to adopt, especially when they've come from a background in C.
In C, everything is passed by value in the technical sense. That is, whatever you give as an argument to a function, it will be copied into that function. For example, calling a function void foo(int) with foo(x) copies the value of x as the parameter of foo. This can be seen in a simple example:
#include <stdio.h>

void foo(int param) { param++; }
int main()
{
    int x = 5;
    foo(x);
    printf("%d\n", x); // x == 5
}
The value of x is copied into foo and that copy is incremented. The x in main continues to have its original value.
As I'm sure you're aware, objects can be of pointer type. For example, int* p defines p as a pointer to an int. It is important to note that the following code introduces two objects:
int x = 5;
int* p = &x;
The first is of type int and has the value 5. The second is of type int* and its value is the address of the first object.
When passing a pointer to a function, you are still passing it by value. The address it contains is copied into the function. Modifying that pointer inside the function will not change the pointer outside the function - however, modifying the object it points to will change the object outside the function. But why?
Because two pointers that have the same value always point at the same object (they contain the same address), the pointed-to object may be accessed and modified through either of them. This gives the semantics of having passed the pointed-to object by reference, although no references ever actually existed - there simply are no references in C. Take a look at the changed example:
#include <stdio.h>

void foo(int* param) { (*param)++; }
int main()
{
    int x = 5;
    foo(&x);
    printf("%d\n", x); // x == 6
}
We can say, when passing the int* into a function, that the int it points to was "passed by reference", but in truth the int was never actually passed anywhere at all - only the pointer was copied into the function. This gives us the colloquial[1] meaning of "pass by value" and "pass by reference".
The usage of this terminology is backed up by terms within the standard. When you have a pointer type, the type that it is pointing to is known as its referenced type. That is, the referenced type of int* is int.
A pointer type may be derived from a function type, an object type, or an incomplete type, called the referenced type.
While the unary * operator (as in *p) is known as indirection in the standard, it is commonly also known as dereferencing a pointer. This further promotes the notion of "passing by reference" in C.
C++ adopted many of its original language features from C. Among them are pointers and so this colloquial form of "passing by reference" can still be used - *p is still dereferencing p. However, using the term will be confusing, because C++ introduces a feature that C doesn't have: the ability to truly pass references.
A type followed by an ampersand is a reference type[2]. For example, int& is a reference to an int. When passing an argument to a function that takes a reference type, the object is truly passed by reference. There are no pointers involved, no copying of objects, no nothing. The name inside the function actually refers to exactly the same object that was passed in. To contrast with the example above:
#include <iostream>

void foo(int& param) { param++; }
int main()
{
    int x = 5;
    foo(x);
    std::cout << x << std::endl; // x == 6
}
Now the foo function has a parameter that is a reference to an int. When passing x, param refers to precisely the same object. Incrementing param has a visible effect on the value of x, and now x has the value 6.
In this example, nothing was passed by value. Nothing was copied. Unlike in C, where passing by reference was really just passing a pointer by value, in C++ we can genuinely pass by reference.
Because of this potential ambiguity in the term "pass by reference", it's best to only use it in the context of C++ when you are using a reference type. If you are passing a pointer, you are not passing by reference, you are passing a pointer by value (that is, of course, unless you are passing a reference to a pointer, e.g. int*&). You may, however, come across uses of "pass by reference" when pointers are being used, but now at least you know what is really happening.
Other programming languages further complicate things. In some, such as Java, every variable of object type is known as a reference to an object (not the same as a reference in C++, more like a pointer), but those references are passed by value. So even though you appear to be passing to a function by reference, what you're actually doing is copying a reference into the function by value. This subtle difference from passing by reference in C++ is noticed when you assign a new object to the reference passed in:
public void foo(Bar param) {
    param.something();
    param = new Bar();
}
If you were to call this function in Java, passing in some object of type Bar, the call to param.something() would be made on the same object you passed in. This is because you passed in a reference to your object. However, even though a new Bar is assigned to param, the object outside the function is still the same old object. The new one is never seen from the outside. That's because the reference inside foo is being reassigned to a new object. This kind of reassignment is impossible with C++ references: a C++ reference is bound once, when it is initialized, and assigning to it assigns to the object it refers to.
[1] By "colloquial", I don't mean to suggest that the C meaning of "pass by reference" is any less truthful than the C++ meaning, just that C++ really does have reference types and so you are genuinely passing by reference. The C meaning is an abstraction over what is really passing by value.
[2] Of course, these are lvalue references and we now have rvalue references too in C++11.