I would like to use gsub to replace every occurrence of a backslash in a string with 2 backslashes.
Currently, what I have tried is gsub("\\\\", "\\", x). This doesn't seem to work. However, if I change the expression to replace each backslash with "a" instead, it works fine.
> gsub("\\\\", "\\", "\\")
[1] ""
> gsub("\\\\", "a", "\\")
[1] "a"
> gsub("\\\\", "\\\\", "\\")
[1] "\\"
The last result is only a single backslash; R prints 2 because it displays strings in escaped form. Using nchar confirms that the length is 1.
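A quick way to check this, as a minimal sketch (the variable name x is only illustrative): print shows the escaped form, while nchar and writeLines show what the string actually contains.

```r
x <- gsub("\\\\", "\\\\", "\\")

print(x)       # displays the escaped form of the string
nchar(x)       # 1: the string holds a single character
writeLines(x)  # writes the raw contents: \
```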
What causes this behavior? The second argument to gsub isn't a regular expression, so I would expect the 4 backslashes in the string literal to be converted to a string with 2 backslashes. It makes even less sense that the first gsub call above returns an empty string.
Here's what you need:
gsub("\\\\", "\\\\\\\\", "\\")
[1] "\\\\"
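The reason that you need four backslashes to represent one literal backslash is that "\" is an escape character both in R strings and for the regex engine to which you're ultimately passing your patterns. If you were talking directly to the regex engine, you'd use "\\" to indicate a literal backslash. But in order to get R to pass "\\" on to the regex engine, you need to type "\\\\". The replacement argument is unescaped in the same way when fixed=FALSE (that is how backreferences such as "\\1" work), so each literal backslash in the output also costs four in the source. A lone backslash in the replacement has nothing to escape and is dropped, which is why your first call returned an empty string.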
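The layers of escaping can be inspected directly. The following is a minimal sketch; the variable names and the sample input "a\\b" are illustrative and not from the question. writeLines shows each string as it is handed to gsub, after R has parsed the literal.

```r
pattern     <- "\\\\"      # R parses this literal into 2 characters: \\
replacement <- "\\\\\\\\"  # R parses this literal into 4 characters: \\\\

writeLines(pattern)      # what the regex engine receives: \\
writeLines(replacement)  # what the replacement parser receives: \\\\

nchar(pattern)      # 2
nchar(replacement)  # 4

# The input holds 3 characters: a, one backslash, b
result <- gsub(pattern, replacement, "a\\b")
writeLines(result)  # a\\b
nchar(result)       # 4: the single backslash was doubled
```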
If you just want to double backslashes, you might want to use this instead, since fixed=TRUE makes gsub treat both the pattern and the replacement as literal strings:
gsub("\\", "\\\\", "\\", fixed=TRUE)
[1] "\\\\"
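As a usage sketch, the fixed=TRUE form can be wrapped in a small helper. The function name double_backslashes and the sample path are hypothetical, chosen only for illustration; they are not part of base R.

```r
# Hypothetical helper: doubles every backslash in x using literal matching
double_backslashes <- function(x) {
  gsub("\\", "\\\\", x, fixed = TRUE)
}

path <- "C:\\Users\\me"               # 11 characters, 2 of them backslashes
writeLines(path)                      # C:\Users\me
writeLines(double_backslashes(path))  # C:\\Users\\me
nchar(double_backslashes(path))       # 13
```

With fixed=TRUE only R's own string escaping applies, so each backslash in the source stands for half a backslash in the data rather than a quarter.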