Linux Shell Script: How to detect NFS Mount-point (or the Server) is dead?

夏期劇場 picture 夏期劇場 · Jul 12, 2013 · Viewed 23.2k times · Source

Generally on NFS Client, how to detect the Mounted-Point is no more available or DEAD from Server-end, by using the Bash Shell Script?

Normally i do:

if ls '/var/data' 2>&1 | grep 'Stale file handle';
then
   echo "failing";
else
   echo "ok";
fi

But the problem is, when especially the NFS Server is totally dead or stopped, even the, ls command, into that directory, at Client-side is hanged or died. Means, the script above is no more usable.

Is there any way to detect this again please?

Answer

Ville picture Ville · Jul 15, 2013

"stat" command is a somewhat cleaner way:

statresult=`stat /my/mountpoint 2>&1 | grep -i "stale"`
if [ "${statresult}" != "" ]; then
  #result not empty: mountpoint is stale; remove it
  umount -f /my/mountpoint
fi

Additionally, you can use rpcinfo to detect whether the remote nfs share is available:

rpcinfo -t remote.system.net nfs > /dev/null 2>&1
if [ $? -eq 0 ]; then
  echo Remote NFS share available.
fi

Added 2013-07-15T14:31:18-05:00:

I looked into this further as I am also working on a script that needs to recognize stale mountpoints. Inspired by one of the replies to "Is there a good way to detect a stale NFS mount", I think the following may be the most reliable way to check for staleness of a specific mountpoint in bash:

read -t1 < <(stat -t "/my/mountpoint")
if [ $? -eq 1 ]; then
   echo NFS mount stale. Removing... 
   umount -f -l /my/mountpoint
fi

"read -t1" construct reliably times out the subshell if stat command hangs for some reason.

Added 2013-07-17T12:03:23-05:00:

Although read -t1 < <(stat -t "/my/mountpoint") works, there doesn't seem to be a way to mute its error output when the mountpoint is stale. Adding > /dev/null 2>&1 either within the subshell, or in the end of the command line breaks it. Using a simple test: if [ -d /path/to/mountpoint ] ; then ... fi also works, and may preferable in scripts. After much testing it is what I ended up using.

Added 2013-07-19T13:51:27-05:00:

A reply to my question "How can I use read timeouts with stat?" provided additional detail about muting the output of stat (or rpcinfo) when the target is not available and the command hangs for a few minutes before it would time out on its own. While [ -d /some/mountpoint ] can be used to detect a stale mountpoint, there is no similar alternative for rpcinfo, and hence use of read -t1 redirection is the best option. The output from the subshell can be muted with 2>&-. Here is an example from CodeMonkey's response:

mountpoint="/my/mountpoint"
read -t1 < <(stat -t "$mountpoint" 2>&-)
if [[ -n "$REPLY" ]]; then
  echo "NFS mount stale. Removing..."
  umount -f -l "$mountpoint"
fi

Perhaps now this question is fully answered :).