what does 'tar --overwrite' actually do (or not do)?

4dummies picture 4dummies · Jun 26, 2018 · Viewed 10.7k times · Source

I see that Linux tar has an option --overwrite. But overwriting seems to be the default. Moreover, specifying tar --no-overwrite does not change this behavior as the info file seems to suggest.

So what does that option actually do?

I test it with

ls -l >junk
ls -l junk
tar -cf junk.tar junk
>junk
ls -l junk
tar  <option?> -xf junk.tar  # option varies, results do not
ls -l junk

Answer

K. A. Buhr picture K. A. Buhr · Jun 26, 2018

There are a few subtleties, but in general, here's the difference:

By default, "tar" tries to open output files with the flags O_CREAT | O_EXCL. If the file exists, this will fail, after which "tar" will retry by first trying to delete the existing file and then re-opening with the same flags (i.e., creating a new file).

In contrast, with the --overwrite option, "tar" tries to open output files with the flags O_CREAT | O_TRUNC. If the file exists, it will be truncated to zero size and overwritten.

The main implication is that "tar" by default will delete and re-create existing files, so they'll get new inode numbers. With --overwrite, the inode numbers won't change:

$ ls -li foo
total 0
5360222 -rw-rw-r-- 1 buhr buhr 0 Jun 26 15:16 bar
$ tar -cf foo.tar foo
$ tar -xf foo.tar  # inode will change
$ ls -li foo
total 0
5360224 -rw-rw-r-- 1 buhr buhr 0 Jun 26 15:16 bar
$ tar --overwrite -xf foo.tar  # inode won't change
$ ls -li foo
total 0
5360224 -rw-rw-r-- 1 buhr buhr 0 Jun 26 15:16 bar
$ 

This also means that, for each file overwritten, "tar" by default will need three syscalls (open, unlink, open) while --overwrite will need only one (open with truncation).