What's the best way to parse a string for "bad" words in C#?

Chaddeus · Jul 9, 2010 · Viewed 9.8k times

I'm thinking of something like:

foreach (var word in paragraph.Split(' ')) {
  if (badWordArray.Contains(word)) {
    // do something about it
  }
}

but I'm sure there's a better way.

Thanks in advance!

UPDATE: I'm not looking to remove obscenities automatically... for my web app, I want to be notified if a word I deem "bad" is used. Then I'll review it myself to make sure it's legit. An auto-flagging system of sorts.

Answer

rakuo15 · Jul 9, 2010

Your way works, but it can be slow: Contains scans the whole array once for every word in the paragraph. There is a wonderful response here to a previous SO question. Though that question is about PHP rather than C#, I think it can be easily ported.

Edit to add sample code:

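// Requires: using System.Text.RegularExpressions;
public string FilterWords(string inputWords) {
    // Matches any listed word anywhere in the input, including inside longer words.
    Regex wordFilter = new Regex("(puppies|kittens|dolphins|crabs)");
    return wordFilter.Replace(inputWords, "<3");
}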

That should work for you, more or less.
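
One caveat with the pattern above is that it is case-sensitive and also matches inside longer words. Here is a minimal sketch of one way to tighten it, assuming you keep your words in a list. The class name WordFilter and the helper BuildFilter are made up for illustration; Regex.Escape, RegexOptions.IgnoreCase, and the \b word-boundary anchor are standard .NET regex features.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordFilter
{
    // Hypothetical helper: builds one case-insensitive pattern from a word list.
    public static Regex BuildFilter(IEnumerable<string> words)
    {
        // Regex.Escape keeps characters such as '.' or '+' in a word literal.
        string alternation = string.Join("|", words.Select(w => Regex.Escape(w)));
        // \b anchors each match to word boundaries, so "crabs" is not
        // matched inside "crabshack".
        return new Regex(@"\b(?:" + alternation + @")\b", RegexOptions.IgnoreCase);
    }

    // Same idea as FilterWords above, but the pattern is passed in so it
    // can be built once and reused.
    public static string FilterWords(string inputWords, Regex filter)
    {
        return filter.Replace(inputWords, "<3");
    }
}
```

With a filter built from "puppies" and "crabs", FilterWords("Crabs at the crabshack", filter) should give "<3 at the crabshack". Building the Regex once and reusing it avoids recompiling the pattern on every call.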

Edit to answer OP clarification:

I'm not looking to remove obscenities automatically... for my web app, I want to be notified if a word I deem "bad" is used.

Much like the replacement above, you can check whether anything matches like so:

public bool HasBadWords(string inputWords) {
    Regex wordFilter = new Regex("(puppies|kittens|dolphins|crabs)");
    // IsMatch only tests for a match; the input is left unchanged.
    return wordFilter.IsMatch(inputWords);
}

It returns true if the string you pass in contains any of the words in the list, including as part of a longer word (for example, "crabs" inside "crabshack"), and false otherwise.
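
Since you want to review flagged text yourself, it may help to know which words triggered the flag, not just that something matched. The following is a rough sketch of that idea, closer to the loop in the question than to the regex above. The names BadWordFlagger and FindBadWords are hypothetical, and it assumes the caller builds the HashSet with StringComparer.OrdinalIgnoreCase so that lookups ignore case.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BadWordFlagger
{
    // Hypothetical helper: returns the distinct listed words found in the text,
    // in the order they first appear.
    public static List<string> FindBadWords(string text, HashSet<string> badWords)
    {
        var found = new List<string>();
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        // \w+ pulls out runs of letters, digits, and underscores,
        // so attached punctuation is dropped.
        foreach (Match token in Regex.Matches(text, @"\w+"))
        {
            // HashSet lookups are constant time on average, unlike scanning an array.
            if (badWords.Contains(token.Value) && seen.Add(token.Value))
            {
                found.Add(token.Value);
            }
        }
        return found;
    }
}
```

For example, with a set containing "puppies" and "kittens", FindBadWords("I love Puppies, puppies!", badWords) should return a list holding only "Puppies". An empty list means there is nothing to review.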