I have seen a few similar questions, but this is what I am trying to achieve.
Given a string, str = "The moon is our natural satellite, i.e. it rotates around the Earth!", I want to extract the words and store them in an array. The expected array elements would be:
the
moon
is
our
natural
satellite
i.e.
it
rotates
around
the
earth
I tried using String.split( ','\t','\r') but this does not work correctly. I also tried removing the periods, commas, and other punctuation marks, but I want a string like "i.e." to be kept intact. What is the best way to achieve this? I also tried Regex.Split, to no avail.
string[] words = Regex.Split(line, @"\W+");
Would surely appreciate some nudges in the right direction.
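A regex solution: match the words instead of splitting on the separators. This pattern matches each run of non-whitespace characters that begins and ends at a word boundary, which trims leading and trailing punctuation (so i.e. comes out as i.e).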
(\b[^\s]+\b)
And if you really want to fix that last . on i.e., you could use this.
((\b[^\s]+\b)((?<=\.\w).)?)
Here's the code I'm using.
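using System;
using System.Text.RegularExpressions;

var input = "The moon is our natural satellite, i.e. it rotates around the Earth!";
// Match each word; the optional tail keeps a period that follows ".<word char>", as in "i.e."
var matches = Regex.Matches(input, @"((\b[^\s]+\b)((?<=\.\w).)?)");
foreach (var match in matches)
{
    Console.WriteLine(match);
}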
Results:
The
moon
is
our
natural
satellite
i.e.
it
rotates
around
the
Earth
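Two caveats with that pattern: the question expects lowercase words, and the unescaped . at the end matches any character that follows, not only a period (in "i.e it" it would pick up the space). A minimal sketch that addresses both, assuming C# 9 or later for top-level statements; ExtractWords is a hypothetical helper name used only for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the lowercased words of a sentence,
// keeping a trailing period on abbreviations such as "i.e.".
static string[] ExtractWords(string input)
{
    // \b\S+\b            a run of non-whitespace that starts and ends at a word boundary
    // (?:(?<=\.\w)\.)?   optionally one literal period, only if preceded by ".<word char>"
    return Regex.Matches(input, @"\b\S+\b(?:(?<=\.\w)\.)?")
        .Cast<Match>()
        .Select(m => m.Value.ToLowerInvariant())
        .ToArray();
}

var words = ExtractWords("The moon is our natural satellite, i.e. it rotates around the Earth!");
Console.WriteLine(string.Join("\n", words));
```

The lookbehind is only a heuristic: a sentence that ends in something like "3.5." would keep its final period too, so adjust it if your input contains such cases.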