flock(): removing locked file without race condition?

Arnaud Le Blanc picture Arnaud Le Blanc · Jul 17, 2013 · Viewed 23.3k times · Source

I'm using flock() for inter-process named mutexes (i.e. some process can decide to hold a lock on "some_name", which is implemented by locking a file named "some_name" in a temp directory:

lockfile = "/tmp/some_name.lock";
fd = open(lockfile, O_CREAT);
flock(fd, LOCK_EX);

do_something();

unlink(lockfile);
flock(fd, LOCK_UN);

The lock file should be removed at some point, to avoid filling the temp directory with hundreds of files.

However, there is an obvious race condition in this code; example with processes A, B and C:

A opens file
A locks file
B opens file
A unlinks file
A unlocks file
B locks file (B holds a lock on the deleted file)
C opens file (a new file one is created)
C locks file (two processes hold the same named mutex !)

Is there a way to remove the lock file at some point without introducing this race condition ?

Answer

user2769258 picture user2769258 · Sep 11, 2013

Sorry if I reply to a dead question:

After locking the file, open another copy of it, fstat both copies and check the inode number, like this:

lockfile = "/tmp/some_name.lock";

    while(1) {
        fd = open(lockfile, O_CREAT);
        flock(fd, LOCK_EX);

        fstat(fd, &st0);
        stat(lockfile, &st1);
        if(st0.st_ino == st1.st_ino) break;

        close(fd);
    }

    do_something();

    unlink(lockfile);
    flock(fd, LOCK_UN);

This prevents the race condition, because if a program holds a lock on a file that is still on the file system, every other program that has a leftover file will have a wrong inode number.

I actually proved it in the state-machine model, using the following properties:

If P_i has a descriptor locked on the filesystem then no other process is in the critical section.

If P_i is after the stat with the right inode or in the critical section it has the descriptor locked on the filesystem.