I recently noticed that String.replaceAll(regex,replacement) behaves very weirdly when it comes to the escape character "\" (backslash).
For example, consider a string holding a file path, String text = "E:\\dummypath", in which we want to replace "\\" with "/".
text.replace("\\","/") gives the output "E:/dummypath", whereas text.replaceAll("\\","/") raises the exception java.util.regex.PatternSyntaxException.
If we want to implement the same functionality with replaceAll(), we need to write it as
text.replaceAll("\\\\","/")
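The three calls above can be put side by side in a small runnable sketch. The class and method names are made up for illustration; only String.replace, String.replaceAll and PatternSyntaxException come from the JDK.

```java
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of any library.
class BackslashReplaceDemo {
    // Literal replacement: the first argument is a plain character sequence.
    static String viaReplace(String text) {
        return text.replace("\\", "/");
    }

    // Regex replacement: the four-backslash literal is the two-character
    // regex that matches one literal backslash.
    static String viaReplaceAll(String text) {
        return text.replaceAll("\\\\", "/");
    }

    // Returns true if a regex made of a single backslash is rejected.
    static boolean singleBackslashRegexFails(String text) {
        try {
            text.replaceAll("\\", "/");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String text = "E:\\dummypath"; // one backslash at runtime
        System.out.println(viaReplace(text));                // E:/dummypath
        System.out.println(viaReplaceAll(text));             // E:/dummypath
        System.out.println(singleBackslashRegexFails(text)); // true
    }
}
```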
One notable difference is that replaceAll() takes a regex as its first argument, whereas replace() takes plain character sequences!
But text.replaceAll("\n","/") works exactly the same as its char-sequence equivalent text.replace("\n","/").
Digging deeper: even weirder behavior can be observed when we try some other inputs.
Let's assign text="Hello\nWorld\n"
Now text.replaceAll("\n","/"), text.replaceAll("\\n","/") and text.replaceAll("\\\n","/") all give the same output: Hello/World/
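To see why the three patterns agree, it can help to print how many characters each regex string actually contains once the compiler has processed the literal. This is a minimal sketch; the class and helper names are hypothetical.

```java
// Hypothetical demo class, not part of any library.
class NewlinePatternDemo {
    // Replaces every match of the given regex with "/".
    static String replaceNewlines(String text, String regex) {
        return text.replaceAll(regex, "/");
    }

    public static void main(String[] args) {
        String text = "Hello\nWorld\n";
        // 1) a raw newline character (1 char)
        // 2) backslash + 'n', the regex escape for newline (2 chars)
        // 3) backslash + raw newline, an escaped literal newline (2 chars)
        String[] regexes = {"\n", "\\n", "\\\n"};
        for (String regex : regexes) {
            System.out.println(regex.length() + " char(s) -> "
                    + replaceNewlines(text, regex));
        }
    }
}
```

Each regex reaches the engine in a different form, but every one of them ends up matching a single newline character, so each line of output ends in Hello/World/.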
I feel Java has really messed up regex in the best possible way! No other language seems to have these playful behaviors in regex. Is there any specific reason why Java behaves like this?
You need to escape twice: once for Java, once for the regex.
The Java code
"\\\\"
makes a regex string of
\\ - two chars
but the regex engine treats the backslash as an escape too, so it turns into
\ - one symbol
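A minimal sketch of the two layers: the length check shows what the compiler leaves behind, and Pattern.quote (a standard JDK method) is shown as one possible way to let the library handle the regex-level escaping. The class and method names are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class EscapeLayersDemo {
    // Layer 1 (Java compiler): the four-backslash literal becomes a
    // two-character string, which is the regex the engine receives.
    static int regexLength() {
        return "\\\\".length();
    }

    // Layer 2 (regex engine) handled by Pattern.quote: only the Java-level
    // escape is written by hand, and the pattern is treated literally.
    static String withQuotedPattern(String text) {
        return text.replaceAll(Pattern.quote("\\"), "/");
    }

    public static void main(String[] args) {
        System.out.println(regexLength());                      // 2
        System.out.println(withQuotedPattern("E:\\dummypath")); // E:/dummypath
    }
}
```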