Is there a good way to detect a stale NFS mount

Chen Levy picture Chen Levy · Oct 29, 2009 · Viewed 41.2k times · Source

I have a procedure I want to initiate only if several tests complete successfully.

One test I need is that all of my NFS mounts are alive and well.

Can I do better than the brute force approach:


mount | sed -n "s/^.* on \(.*\) type nfs .*$/\1/p" | 
while read mount_point ; do 
  timeout 10 ls $mount_point >& /dev/null || echo "stale $mount_point" ; 
done

Here timeout is a utility that will run the command in the background, and will kill it after a given time, if no SIGCHLD was caught prior to the time limit, returning success/fail in the obvious way.


In English: Parse the output of mount, check (bounded by a timeout) every NFS mount point. Optionally (not in the code above) breaking on the first stale mount.

Answer

astrostl picture astrostl · Aug 24, 2011

A colleague of mine ran into your script. This doesn't avoid a "brute force" approach, but if I may in Bash:

while read _ _ mount _; do 
  read -t1 < <(stat -t "$mount") || echo "$mount timeout"; 
done < <(mount -t nfs)

mount can list NFS mounts directly. read -t (a shell builtin) can time out a command. stat -t (terse output) still hangs like an ls*. ls yields unnecessary output, risks false positives on huge/slow directory listings, and requires permissions to access - which would also trigger a false positive if it doesn't have them.

while read _ _ mount _; do 
  read -t1 < <(stat -t "$mount") || lsof -b 2>/dev/null|grep "$mount"; 
done < <(mount -t nfs)

We're using it with lsof -b (non-blocking, so it won't hang too) in order to determine the source of the hangs.

Thanks for the pointer!

  • test -d (a shell builtin) would work instead of stat (a standard external) as well, but read -t returns success only if it doesn't time out and reads a line of input. Since test -d doesn't use stdout, a (( $? > 128 )) errorlevel check on it would be necessary - not worth the legibility hit, IMO.