Converting a sentence string to a string array of words in Java

AnimatedRNG · Jan 12, 2011 · Viewed 279.8k times

I need my Java program to take a string like:

"This is a sample sentence."

and turn it into a string array like:

{"this","is","a","sample","sentence"}

No periods or other punctuation (preferably). By the way, the input string is always a single sentence.

Is there an easy way to do this that I'm not seeing? Or do I really have to search for the spaces myself and build new strings from the text between them (the words)?

Answer

Adam Batkin · Jan 12, 2011

String.split() will do most of what you want. You may then need to loop over the words to pull out any punctuation.

For example:

String s = "This is a sample sentence.";
String[] words = s.split("\\s+");
for (int i = 0; i < words.length; i++) {
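    // Strip every non-word character from the token. You may want to
    // check whether the token contains one before replacing blindly.
    // By default \w matches only [a-zA-Z0-9_], so adjust the character
    // class if you need to keep accented letters, apostrophes, etc.
    words[i] = words[i].replaceAll("[^\\w]", "");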
}
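
Note that the snippet above keeps the original capitalization ("This"), while the question asks for lowercase words. A minimal sketch of one way to cover that as well, assuming it is acceptable to drop every character that is not a letter, digit, or whitespace: lowercase the sentence, strip the punctuation first, then split. The class name SentenceWords and the helper toWords are made up for illustration and are not part of any library.

```java
import java.util.Arrays;
import java.util.Locale;

public class SentenceWords {

    // Hypothetical helper: lowercases the sentence, removes everything
    // that is not a letter, a decimal digit, or whitespace, then splits
    // on runs of whitespace.
    static String[] toWords(String sentence) {
        String cleaned = sentence
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}\\s]", "")
                .trim();
        // "".split(...) would return a single empty string, so handle
        // the empty case explicitly.
        if (cleaned.isEmpty()) {
            return new String[0];
        }
        return cleaned.split("\\s+");
    }

    public static void main(String[] args) {
        String[] words = toWords("This is a sample sentence.");
        System.out.println(Arrays.toString(words));
        // prints [this, is, a, sample, sentence]
    }
}
```

Stripping punctuation before splitting avoids empty array entries for tokens that consist only of punctuation. The trade-off is that apostrophes and hyphens are removed too, so "don't" becomes "dont"; adjust the character class if that matters for your input.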