Hive: regexp_replace square brackets

Sarah Bergquist picture Sarah Bergquist · Jan 15, 2014 · Viewed 15.2k times · Source

I need to replace the square brackets around a key:value pair similar to the following. Any help is much appreciated!

data in 'properties' looks like this:

name: property1
value: [12345667:97764458]

**code**
SELECT p.name, regexp_replace(p.value,'[','') AS value
FROM properties p

Solved: Revised code

SELECT p.name, regexp_replace(p.value,'\\[|\\]','') AS value
FROM properties p;

Answer

iggy picture iggy · Feb 5, 2014

You always need to double your backslashes in Hive regexes. That's because a single backslash is used as an escape character in Hive strings so it would get stripped off before the regex parser can see it. A double backslash becomes a single backslash, which is what you need.

To see how backslashes get stripped just run a select using your regex as a string literal:

select '\\[' from t limit 1;
OK
\[