What's the difference between passing by reference vs. passing by value?

ritu · Dec 17, 2008 · Viewed 729.9k times

What is the difference between

  1. a parameter passed by reference
  2. a parameter passed by value?

Could you give me some examples, please?

Answer

Dylan Beattie · Jan 10, 2009

First and foremost, the "pass by value vs. pass by reference" distinction as defined in CS theory is now obsolete, because the technique originally defined as "pass by reference" has since fallen out of favor and is seldom used now.[1]

Newer languages[2] tend to use a different (but similar) pair of techniques to achieve the same effects (see below), which is the primary source of confusion.

A secondary source of confusion is the fact that in "pass by reference", "reference" has a narrower meaning than the general term "reference" (because the phrase predates it).


Now, the authentic definition is:

  • When a parameter is passed by reference, the caller and the callee use the same variable for the parameter. If the callee modifies the parameter variable, the effect is visible to the caller's variable.

  • When a parameter is passed by value, the caller and callee have two independent variables with the same value. If the callee modifies the parameter variable, the effect is not visible to the caller.
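As a minimal sketch of the second bullet, here is how it looks in Python (chosen only for illustration; the function name `reassign` is hypothetical). Python offers no true pass-by-reference, so only the pass-by-value case can be shown directly:

```python
def reassign(n):
    # `n` is the callee's own variable. Assigning to it rebinds only `n`
    # and leaves the caller's variable alone.
    n = 99
    return n


x = 1
result = reassign(x)

print(x)       # 1  (the caller's variable is unchanged)
print(result)  # 99 (only the callee's variable was reassigned)
```

Under true pass-by-reference, the assignment inside `reassign` would have changed `x` itself.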

Things to note in this definition are:

  • "Variable" here means the caller's (local or global) variable itself -- i.e. if I pass a local variable by reference and assign to it, I'll change the caller's variable itself, not e.g. whatever it is pointing to if it's a pointer.

    • This is now considered bad practice (as an implicit dependency). As such, virtually all newer languages are exclusively, or almost exclusively, pass-by-value. Pass-by-reference is now chiefly used in the form of "output/inout arguments" in languages where a function cannot return more than one value.
  • The meaning of "reference" in "pass by reference". The difference from the general term "reference" is that this "reference" is temporary and implicit. What the callee basically gets is a "variable" that is somehow "the same" as the original one. How specifically this effect is achieved is irrelevant (e.g. the language may also expose some implementation details -- addresses, pointers, dereferencing -- this is all irrelevant; if the net effect is this, it's pass-by-reference).


Now, in modern languages, variables tend to be of "reference types" (another concept invented later than "pass by reference" and inspired by it), i.e. the actual object data is stored separately somewhere (usually, on the heap), and only "references" to it are ever held in variables and passed as parameters.[3]

Passing such a reference falls under pass-by-value because a variable's value is technically the reference itself, not the referred object. However, the net effect on the program can be the same as either pass-by-value or pass-by-reference:

  • If a reference is just taken from a caller's variable and passed as an argument, this has the same effect as pass-by-reference: if the referred object is mutated in the callee, the caller will see the change.
    • However, if a variable holding this reference is reassigned, it will stop pointing to that object, so any further operations on this variable will instead affect whatever it is pointing to now.
  • To have the same effect as pass-by-value, a copy of the object is made at some point. Options include:
    • The caller can just make a private copy before the call and give the callee a reference to that instead.
    • In some languages, some object types are "immutable": any operation on them that seems to alter the value actually creates a completely new object without affecting the original one. So, passing an object of such a type as an argument always has the effect of pass-by-value: a copy for the callee will be made automatically if and when it needs a change, and the caller's object will never be affected.
      • In purely functional languages, all objects are immutable.
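The cases in the list above can be sketched in Python, where lists are mutable and strings are immutable. All function and variable names here are hypothetical, and this is an illustration under those assumptions rather than a definitive treatment:

```python
import copy


def append_item(items):
    # Mutates the object that both caller and callee refer to.
    items.append("added")


def rebind(items):
    # Reassigns the callee's variable to a new list; later operations
    # affect that new list, not the caller's object.
    items = ["replaced"]
    items.append("added")


def shout(text):
    # str is immutable: upper() builds a new string object.
    text = text.upper()
    return text


shared = ["a"]
append_item(shared)
print(shared)  # ['a', 'added']  (mutation is visible to the caller)

untouched = ["a"]
rebind(untouched)
print(untouched)  # ['a']  (reassignment in the callee is not visible)

original = ["a"]
append_item(copy.copy(original))  # caller hands over a private copy
print(original)  # ['a']

greeting = "hi"
loud = shout(greeting)
print(greeting, loud)  # hi HI
```

Note that `copy.copy` makes a shallow copy; for nested mutable objects, `copy.deepcopy` would be needed to fully isolate the caller's data.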

As you may see, this pair of techniques is almost the same as those in the definition, only with a level of indirection: just replace "variable" with "referenced object".

There's no agreed-upon name for them, which leads to contorted explanations like "call by value where the value is a reference". In 1975, Barbara Liskov suggested the term "call-by-object-sharing" (or sometimes just "call-by-sharing"), though it never quite caught on. Moreover, neither of these phrases draws a parallel with the original pair. No wonder the old terms ended up being reused in the absence of anything better, leading to confusion.[4]


NOTE: For a long time, this answer used to say:

Say I want to share a web page with you. If I tell you the URL, I'm passing by reference. You can use that URL to see the same web page I can see. If that page is changed, we both see the changes. If you delete the URL, all you're doing is destroying your reference to that page - you're not deleting the actual page itself.

If I print out the page and give you the printout, I'm passing by value. Your page is a disconnected copy of the original. You won't see any subsequent changes, and any changes that you make (e.g. scribbling on your printout) will not show up on the original page. If you destroy the printout, you have actually destroyed your copy of the object - but the original web page remains intact.

This is mostly correct except for the narrower meaning of "reference": in "pass by reference" it is both temporary and implicit (it doesn't have to be, but being explicit and/or persistent are additional features, not part of the pass-by-reference semantics, as explained above). A closer analogy would be giving you a copy of a document vs. inviting you to work on the original.


[1] Unless you are programming in Fortran or Visual Basic, it's not the default behavior, and in most languages in modern use, true call-by-reference is not even possible.

[2] A fair number of older ones support it, too.

[3] In several modern languages, all types are reference types. This approach was pioneered by the language CLU in 1975 and has since been adopted by many other languages, including Python and Ruby. Many more languages use a hybrid approach, where some types are "value types" and others are "reference types"; among them are C#, Java, and JavaScript.

[4] There's nothing wrong with recycling a fitting old term per se, but one has to somehow make it clear which meaning is used each time. Not doing that is exactly what keeps causing confusion.