Test for empty string with X""

mgalgs picture mgalgs · Jul 28, 2011 · Viewed 66.4k times · Source

I know I can test for an empty string in Bash with -z like so:

if [[ -z $myvar ]]; then do_stuff; fi

but I see a lot of code written like:

if [[ X"" = X"$myvar" ]]; then do_stuff; fi

Is that method more portable? Is it just historical cruft from before the days of -z? Is it for POSIX shells (even though I've seen it used in scripts targeting bash)? Ready for my history/portability lesson.


The same question was asked on Server Fault as How to determine if a bash variable is empty? but no one offered an explanation as to why you see code with the X"" stuff.

Answer

Jonathan Leffler picture Jonathan Leffler · Jul 28, 2011

Fundamentally, because in times now long past, the behaviour of test was more complex and not uniformly defined across different systems (so portable code had to be written carefully to avoid non-portable constructs).

In particular, before test was a shell built-in, it was a separate executable (and note that MacOS X still has /bin/test and /bin/[ as executables). When that was the case, writing:

if [ -z $variable ]

when $variable was empty would invoke the test program via its alias [ with 3 arguments:

argv[0] = "["
argv[1] = "-z"
argv[2] = "]"

because the variable was empty so there was nothing to expand. So, the safe way of writing the code was:

if [ -z "$variable" ]

This works reliably, passing 4 arguments to the test executable. Granted, the test program has been a built-in to most shells for decades, but old equipment dies hard, and so do good practices learned even longer ago.

The other problem resolved by the X prefix was what happened if variables include leading dashes, or contain equals or other comparators. Consider (a not desparately good example):

x="-z"
if [ $x -eq 0 ]

Is that an empty string test with a stray (erroneous) argument, or a numeric equality test with a non-numeric first argument? Different systems provided different answers before POSIX standardized the behaviour, circa 1990. So, the safe way of dealing with this was:

if [ "X$x" = "X0" ]

or (less usually, in my experience, but completely equivalently):

if [ X"$x" = X"0" ]

It was all the edge cases like this, tied up with the possibility that the test was a separate executable, that means that portable shell code still uses double quotes more copiously than the modern shells actually require, and the X-prefix notation was used to ensure that things could not get misinterpreted.