Why does String.replaceAll() in Java require four backslashes ("\\\\") in the regex to replace a single "\"?

Bharath · Sep 18, 2013 · Viewed 39.6k times

I recently noticed that String.replaceAll(regex, replacement) behaves very oddly when it comes to the escape character "\" (backslash).

For example, consider a string holding a file path, String text = "E:\\dummypath", in which we want to replace the "\\" with "/".

text.replace("\\","/") gives the output "E:/dummypath", whereas text.replaceAll("\\","/") throws java.util.regex.PatternSyntaxException.

To get the same result with replaceAll(), we need to write text.replaceAll("\\\\","/").

One notable difference is that replaceAll() interprets its first argument as a regex, whereas replace() takes plain character sequences.

But text.replaceAll("\n","/") works exactly the same as its character-sequence equivalent, text.replace("\n","/").
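
The three calls described so far can be checked side by side. This is only a minimal sketch: the class and method names (BackslashReplaceDemo, withReplace, withReplaceAll, singleBackslashFails) are made up for illustration and are not part of any library.

```java
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the names are illustrative only.
final class BackslashReplaceDemo {

    // Literal replacement: the source literal "\\" is a one-character
    // string holding a single backslash.
    static String withReplace(String path) {
        return path.replace("\\", "/");
    }

    // Regex replacement: the source literal "\\\\" is a two-character
    // string, which the regex engine reads as one escaped backslash.
    static String withReplaceAll(String path) {
        return path.replaceAll("\\\\", "/");
    }

    // Returns true if a lone backslash is rejected as a regex.
    static boolean singleBackslashFails(String path) {
        try {
            path.replaceAll("\\", "/");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String text = "E:\\dummypath";
        System.out.println(withReplace(text));          // E:/dummypath
        System.out.println(withReplaceAll(text));       // E:/dummypath
        System.out.println(singleBackslashFails(text)); // true
    }
}
```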

Digging deeper: even stranger behavior shows up when we try some other inputs.

Let's assign text = "Hello\nWorld\n".

Now text.replaceAll("\n","/"), text.replaceAll("\\n","/"), and text.replaceAll("\\\n","/") all give the same output: Hello/World/
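
A short sketch makes it easier to see what the regex engine actually receives in each of these three calls. The class and method names (NewlinePatternDemo, replaceNewlines) are hypothetical and chosen only for this example.

```java
// Hypothetical demo class; the names are illustrative only.
final class NewlinePatternDemo {

    // Applies the three patterns from the question to the same input.
    static String[] replaceNewlines(String text) {
        return new String[] {
            // Regex is one raw newline character, matched literally.
            text.replaceAll("\n", "/"),
            // Regex is backslash + 'n', the regex escape for a newline.
            text.replaceAll("\\n", "/"),
            // Regex is backslash + raw newline: an escaped literal newline.
            text.replaceAll("\\\n", "/")
        };
    }

    public static void main(String[] args) {
        for (String result : replaceNewlines("Hello\nWorld\n")) {
            System.out.println(result); // Hello/World/ each time
        }
    }
}
```

All three patterns end up matching the newline character, just by different routes: the first two differ in whether the Java compiler or the regex engine resolves the \n, and the third works because the regex engine accepts a backslash in front of a non-alphabetic character as a plain escape.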

I feel Java has really messed up its regex handling here. No other language seems to have this kind of playful behavior in its regexes. Is there a specific reason why Java works like this?

Answer

Peter Lawrey · Sep 18, 2013

You need to escape twice: once for Java, once for the regex.

The Java source literal

"\\\\"

produces a regex string of

\\ - two chars

but the regex engine needs an escape too, so the pattern matches

\ - one symbol
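
The two layers can be observed directly, as in the sketch below. The class and method names (EscapeLayersDemo and its methods) are hypothetical. The last method uses Pattern.quote, which the answer does not mention; it is shown only as one possible way to let the library handle the regex-level escaping.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class EscapeLayersDemo {

    // Layer 1 (Java compiler): four backslashes in source become two chars.
    static int lengthAfterCompiler() {
        return "\\\\".length();
    }

    // Layer 2 (regex engine): those two chars match one literal backslash.
    static boolean matchesSingleBackslash() {
        return Pattern.compile("\\\\").matcher("\\").matches();
    }

    // Alternative (an addition, not from the answer): Pattern.quote builds
    // a regex that matches its argument literally.
    static String viaQuote(String path) {
        return path.replaceAll(Pattern.quote("\\"), "/");
    }

    public static void main(String[] args) {
        System.out.println(lengthAfterCompiler());        // 2
        System.out.println(matchesSingleBackslash());     // true
        System.out.println(viaQuote("E:\\dummypath"));    // E:/dummypath
    }
}
```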