How do I write a regular expression that excludes rather than matches, e.g., not (this|string)?

Anycorn · Feb 7, 2010 · Viewed 30.6k times

I am stumped trying to create an Emacs regular expression that excludes groups. [^] excludes individual characters in a set, but I want to exclude specific sequences of characters: something like [^(not|this)], so that strings containing "not" or "this" are not matched.

In principle, I could write ([^n][^o][^t]|[^...]), but is there another way that's cleaner?

Answer

Tomalak · Feb 7, 2010

This is not easily done. Regular expressions are designed to match things, and that is all they can do.

First off: [^] does not designate an "excludes group", it designates a negated character class. Character classes do not support grouping in any shape or form. They support single characters (and, for convenience, character ranges). Your attempt [^(not|this)] is 100% equivalent to [^)(|hinots], as far as the regex engine is concerned.
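To see that equivalence concretely, here is a minimal sketch. It uses Python's re module purely for illustration (the question is about Emacs, whose regexp syntax differs, but negated character classes behave the same way here). The helper name `accepted` is made up for this example.

```python
import re
import string

# The asker's attempt and the explicit character list from the answer.
attempt = re.compile(r"[^(not|this)]")
explicit = re.compile(r"[^)(|hinots]")

def accepted(pattern):
    """Return the set of printable ASCII characters the pattern matches."""
    return {ch for ch in string.printable if pattern.fullmatch(ch)}

# Both classes accept exactly the same single characters.
same = accepted(attempt) == accepted(explicit)

# The characters the class rejects: the parentheses, the pipe, and the
# letters of "not" and "this", each treated as an individual character.
rejected = sorted(set(string.printable) - accepted(attempt))

print(same)
print("".join(rejected))
```

Inside the brackets, the parentheses and the pipe lose their special meaning, so the class never sees "not" or "this" as words.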

There are three ways out of this situation:

  1. match (not|this) and exclude those matches with the help of the environment you are in (that is, negate the match result)
  2. use negative look-ahead, if your regex engine supports it and it is feasible in the situation
  3. rewrite the expression so it can match: see a similar question I asked earlier
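The first two ways can be sketched as follows. This is only an illustration in Python's re module, not Emacs Lisp: the sample strings and variable names are invented, and whether look-ahead is available at all depends on your regex engine, so check that before relying on the second variant.

```python
import re

# Hypothetical sample input: two strings to keep, two to reject.
lines = ["keep me", "do not keep", "this goes too", "fine as well"]

# Way 1: match the unwanted alternatives, then negate the result in code.
unwanted = re.compile(r"not|this")
kept_by_negation = [s for s in lines if not unwanted.search(s)]

# Way 2: negative look-ahead. From the start of the string, assert that
# neither "not" nor "this" occurs anywhere ahead, then match the string.
no_unwanted = re.compile(r"^(?!.*(?:not|this)).*$")
kept_by_lookahead = [s for s in lines if no_unwanted.match(s)]

print(kept_by_negation)
print(kept_by_lookahead)
```

Both variants keep the same strings. Note that they reject any string containing the character sequences "not" or "this", even inside a longer word such as "another"; add word boundaries to the pattern if only whole words should count.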