Java String split removed empty values

Reddy picture Reddy · Jan 30, 2013 · Viewed 166k times · Source

I am trying to split the Value using a separator. But I am finding the surprising results

String data = "5|6|7||8|9||";
String[] split = data.split("\\|");
System.out.println(split.length);

I am expecting to get 8 values. [5,6,7,EMPTY,8,9,EMPTY,EMPTY] But I am getting only 6 values.

Any idea and how to fix. No matter EMPTY value comes at anyplace, it should be in array.

Answer

jlordo picture jlordo · Jan 30, 2013

split(delimiter) by default removes trailing empty strings from result array. To turn this mechanism off we need to use overloaded version of split(delimiter, limit) with limit set to negative value like

String[] split = data.split("\\|", -1);

Little more details:
split(regex) internally returns result of split(regex, 0) and in documentation of this method you can find (emphasis mine)

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array.

If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.

If n is non-positive then the pattern will be applied as many times as possible and the array can have any length.

If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

Exception:

It is worth mentioning that removing trailing empty string makes sense only if such empty strings ware created by split mechanism. So for "".split(anything) since we can't split "" farther we will get as result [""] array.
It happens because split didn't happen here, so "" despite being empty and trailing represents original string, not empty string which was created by splitting process.