Matching strings with wildcard

Robinson picture Robinson · May 18, 2015 · Viewed 100.6k times · Source

I would like to match strings with a wildcard (*), where the wildcard means "any". For example:

*X = string must end with X
X* = string must start with X
*X* = string must contain X

Also, some compound uses such as:

*X*YZ* = string contains X and contains YZ
X*YZ*P = string starts with X, contains YZ and ends with P.

Is there a simple algorithm to do this? I'm unsure about using regex (though it is a possibility).

To clarify, the users will type in the above to a filter box (as simple a filter as possible), I don't want them to have to write regular expressions themselves. So something I can easily transform from the above notation would be good.

Answer

Dmitry Bychenko picture Dmitry Bychenko · May 18, 2015

Often, wild cards operate with two type of jokers:

  ? - any character  (one and only one)
  * - any characters (zero or more)

so you can easily convert these rules into appropriate regular expression:

  // If you want to implement both "*" and "?"
  private static String WildCardToRegular(String value) {
    return "^" + Regex.Escape(value).Replace("\\?", ".").Replace("\\*", ".*") + "$"; 
  }

  // If you want to implement "*" only
  private static String WildCardToRegular(String value) {
    return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$"; 
  }

And then you can use Regex as usual:

  String test = "Some Data X";

  Boolean endsWithEx = Regex.IsMatch(test, WildCardToRegular("*X"));
  Boolean startsWithS = Regex.IsMatch(test, WildCardToRegular("S*"));
  Boolean containsD = Regex.IsMatch(test, WildCardToRegular("*D*"));

  // Starts with S, ends with X, contains "me" and "a" (in that order) 
  Boolean complex = Regex.IsMatch(test, WildCardToRegular("S*me*a*X"));