Does Hive have a String split function?

user855 · Nov 1, 2010 · Viewed 123.9k times

I am looking for a built-in string split function in Hive. For example, if the string is:

A|B|C|D|E

Then I want to have a function like:

array<string> split(string input, char delimiter)

So that I get back:

[A,B,C,D,E]

Does such a built-in split function exist in Hive?

I can only see regexp_extract and regexp_replace. I would love to see indexOf() and split() string functions.

Answer

Bkkbrad · Nov 4, 2010

There is a split function based on regular expressions. It's not listed in the tutorial, but it is listed in the language manual on the wiki:

split(string str, string pat)
   Split str around pat (pat is a regular expression) 

In your case, the delimiter "|" has a special meaning in a regular expression (it is the alternation operator), so it has to be escaped and written as "\\|".
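
As a minimal sketch of how this looks in a query, assuming it is typed directly into the Hive CLI or a script file (going through a shell or another client may need an extra layer of quoting), and using the literal string from the question rather than a real column:

```sql
-- Split on a literal pipe. The Hive string literal '\\|' reaches the
-- regex engine as \| , i.e. an escaped pipe rather than alternation.
SELECT split('A|B|C|D|E', '\\|');
-- returns ["A","B","C","D","E"]

-- An alternative that avoids backslashes: inside a character class
-- the pipe is treated as an ordinary character.
SELECT split('A|B|C|D|E', '[|]');

-- The result is an array<string>, so elements can be indexed from 0.
SELECT split('A|B|C|D|E', '\\|')[0];
-- returns A
```

Passing an unescaped '|' as the pattern does not raise an error, but it does not split on the pipe character either, so the escaping is easy to overlook.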