Shallow copy for arrays: why can't I simply do newArr = oldArr?

onepiece · Oct 31, 2012 · Viewed 7.8k times

Let's say I have an array of integers, "orig"

I want to shallow copy it, so can't I just do this:

int[] shallow = orig;

My professor said that for primitives, shallow and deep copy are essentially the same, in that we have to copy over each index of the array. But setting the whole array equal to another array does the same thing, right?

I have a similar question with object arrays

This was my idea:

Book[] objArr2 = objArr1;

But I was told that I would have to copy each array index over, like

//for loop
objArr2[i] = objArr1[i];

For shallow copying, is there really any difference between assigning one array to another and individually copying each array index? (I understand that deep means you have to create brand new objects.)

Answer

Vivin Paliath · Oct 31, 2012

I want to shallow copy it, so can't I just do this:

int[] shallow = orig;

That's not really a shallow copy. A copy is a discrete entity that is similar to the original, but is not the original item. In your example, what you actually have are two references that are pointing to the same object. When you create a copy, you should have two resulting objects: the original and the copy.

Here, anything you do to modify shallow will happen to orig as well since they both point to the same object.
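A minimal sketch of that aliasing, using the question's variable names. The class name AliasDemo and the helper writeThroughAlias are hypothetical, chosen only for this illustration:

```java
// Hypothetical demo class; orig and shallow follow the names in the question.
class AliasDemo {
    // Writes through the second reference and returns what the original sees at index 0.
    static int writeThroughAlias() {
        int[] orig = {1, 2, 3};
        int[] shallow = orig;   // no new array is created: a second reference to the same array
        shallow[0] = 99;        // the write lands in the one shared array
        return orig[0];
    }

    public static void main(String[] args) {
        int[] orig = {1, 2, 3};
        int[] shallow = orig;
        System.out.println(shallow == orig);      // true: both names refer to one array object
        System.out.println(writeThroughAlias());  // 99
    }
}
```

Because == on arrays compares references, the first line printed confirms that no second array exists.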

"Shallowness" comes into play when the object you are comparing has references to other objects inside it. For example, if you have an array of integers and you create a copy, you now have two arrays which both contain the same integer values:

Original Array

[0]
[1]
[2]
[3]

After copying:

[0] <--- Original  [0]
[1]                [1]
[2]                [2]
[3]      Copy ---> [3]
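For an array of primitives, an actual copy can be sketched as below. Arrays.copyOf is a standard library call (orig.clone() would work as well); the class name PrimitiveCopyDemo and the helper copy are hypothetical names for this example:

```java
import java.util.Arrays;

// Hypothetical demo class showing a real copy of a primitive array.
class PrimitiveCopyDemo {
    // Returns a new array holding the same int values as source.
    static int[] copy(int[] source) {
        return Arrays.copyOf(source, source.length);
    }

    public static void main(String[] args) {
        int[] orig = {0, 1, 2, 3};
        int[] copy = copy(orig);

        copy[0] = 42;                          // changes only the copy
        System.out.println(orig[0]);           // 0
        System.out.println(copy[0]);           // 42
        System.out.println(copy == orig);      // false: two distinct array objects
    }
}
```

Since the elements are plain values, there is nothing further to copy, which is why shallow and deep copies coincide for primitive arrays.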

However, what if you had an array that consists of objects (let's say objArr1 and objArr2)? When you do a shallow copy you now have two distinct array objects, but each corresponding entry between the two arrays points to the same object (because the objects themselves haven't been copied; just the references have).

Original Array:

[0:]----> [object 0]
[1:]----> [object 1]
[2:]----> [object 2]
[3:]----> [object 3]

After copying (notice how the corresponding locations are pointing to the same instances):

Original -> [0:]----> [object 0] <----[:0] <- Copy
            [1:]----> [object 1] <----[:1]
            [2:]----> [object 2] <----[:2]
            [3:]----> [object 3] <----[:3]

Now if you modify objArr1 by replacing an entry or setting it to null, that same thing doesn't happen to objArr2. However, if you modify the object at objArr1[0], that is reflected in objArr2[0] as well, since those locations point to the same object. So in this case, even though the container objects themselves are distinct, what they contain are references to the same objects.
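Here is a sketch of a shallow copy of an object array, written as the element-by-element loop from the question. The Book class with a single mutable title field is an assumption made for illustration, as are the names ShallowCopyDemo and shallowCopy:

```java
// Hypothetical demo class; Book is an assumed stand-in for the question's Book type.
class ShallowCopyDemo {
    static class Book {
        String title;
        Book(String title) { this.title = title; }
    }

    // Creates a new array whose slots hold the same references as source.
    static Book[] shallowCopy(Book[] source) {
        Book[] copy = new Book[source.length];
        for (int i = 0; i < source.length; i++) {
            copy[i] = source[i];   // copies the reference, not the Book it points to
        }
        return copy;
    }

    public static void main(String[] args) {
        Book[] objArr1 = { new Book("A"), new Book("B") };
        Book[] objArr2 = shallowCopy(objArr1);

        objArr1[1] = new Book("C");             // replace an entry in the original array only
        System.out.println(objArr2[1].title);   // B: the copy's slot is unaffected

        objArr1[0].title = "Changed";           // mutate the shared Book object
        System.out.println(objArr2[0].title);   // Changed: both slots point to that Book
    }
}
```

The built-in objArr1.clone() produces the same kind of shallow copy without the explicit loop.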

When you do a deep copy, you will have two arrays where each corresponding location points to a different instance. So essentially you make copies of objects all the way down.

My professor said that for primitives, shallow and deep copy are essentially the same, in that we have to copy over each index of the array.

The important distinction to make is that when you copy an array of primitives, you are copying the values themselves, so each slot in the copy holds its own independent value. However, when you have an array of objects, what you really have is an array of references to objects. So when you create a copy, all you have done is create a new array that has copies of the references in the original array, and these copied references still point to the same corresponding objects. This is what's known as a shallow copy. If you deep-copied the array, then the objects that each individual location refers to would have been copied as well. So you would see something like this:

Original -> [0:]----> [object 0] Copy -> [0:]----> [copy of object 0]
            [1:]----> [object 1]         [1:]----> [copy of object 1]
            [2:]----> [object 2]         [2:]----> [copy of object 2]
            [3:]----> [object 3]         [3:]----> [copy of object 3]
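A deep copy along those lines might look like the following sketch. It assumes a hypothetical Book class with a copy constructor; because its only field is an immutable String, copying the Book itself is enough to copy "all the way down" here. A class with mutable fields of its own would need those copied too.

```java
// Hypothetical demo class; Book and its copy constructor are assumptions for illustration.
class DeepCopyDemo {
    static class Book {
        String title;
        Book(String title) { this.title = title; }
        Book(Book other) { this.title = other.title; }   // copy constructor
    }

    // Creates a new array and a new Book for every non-null slot.
    static Book[] deepCopy(Book[] source) {
        Book[] copy = new Book[source.length];
        for (int i = 0; i < source.length; i++) {
            copy[i] = (source[i] == null) ? null : new Book(source[i]);
        }
        return copy;
    }

    public static void main(String[] args) {
        Book[] original = { new Book("A"), new Book("B") };
        Book[] copy = deepCopy(original);

        original[0].title = "Changed";             // mutate a Book in the original
        System.out.println(copy[0].title);         // A: the copy owns a separate Book
        System.out.println(copy[0] == original[0]); // false: different instances
    }
}
```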

But setting the whole array equal to another array does the same thing, right?

No, it does not. What you're doing here is simply creating a new reference to an existing array:

arr1 -> [0, 1, 2, 3, 4]

Now let's say you did arr2 = arr1. What you have is:

arr1 -> [0, 1, 2, 3, 4] <- arr2

So here both arr1 and arr2 are pointing to the same array. Any modification you perform using arr1 will be reflected when you access the array using arr2, since you are looking at the same array. This doesn't happen when you make copies.