Regex help NOT a-z or 0-9

s15199d · Jun 17, 2011 · Viewed 24.5k times

I need a regex to find all characters that are NOT a-z or 0-9.

I don't know the syntax for the NOT operator in regex.

I want the regex to be NOT [a-z, A-Z, 0-9].

Thanks in advance!

Answer

Michael Lowman · Jun 17, 2011

It's ^, placed as the first character inside the brackets. Your regex should be [^a-zA-Z0-9]. Beware: this character class may behave unexpectedly on non-ASCII text. For instance, it would match é, because é falls outside the ASCII ranges a-z and A-Z.
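As a quick illustration of the negated class, here is a minimal sketch in Python (the question does not say which tool is in use, so the language and the sample string are assumptions). It lists every character that is not an ASCII letter or digit, and shows that é is among them.

```python
import re

# Negated character class: matches any single character that is
# NOT an ASCII letter (a-z, A-Z) or an ASCII digit (0-9).
NON_ALNUM = re.compile(r"[^a-zA-Z0-9]")

# Hypothetical sample input; \u00e9 is the precomposed character é.
sample = "caf\u00e9 42!"

matches = NON_ALNUM.findall(sample)
print(matches)  # ['é', ' ', '!']
```

Note that the space is matched too, since it is neither a letter nor a digit.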

Edited

If the regexes are Perl-compatible (PCRE), you can use \s to match all whitespace: spaces, tabs, newlines, and other whitespace characters. If they're POSIX-compatible, use the [:space:] character class (like so: [^a-zA-Z0-9[:space:]]). I would recommend using [:alnum:] instead of a-zA-Z0-9, which gives [^[:alnum:][:space:]].
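To show the effect of excluding whitespace, here is a small Python sketch (again an assumed language, with a made-up sample string). Python's built-in re module understands \s but does not implement POSIX classes such as [:space:], so only the \s form is shown.

```python
import re

# Adding \s to the negated class keeps whitespace out of the matches,
# leaving only punctuation and other symbols.
NON_ALNUM_OR_SPACE = re.compile(r"[^a-zA-Z0-9\s]")

# Hypothetical sample input containing a space, a tab, and a newline.
sample = "foo bar\t#1, baz!\n"

symbols = NON_ALNUM_OR_SPACE.findall(sample)
print(symbols)  # ['#', ',', '!']
```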

If you want to match at the end of a line, include a $ at the end of the pattern. Turn on multiline mode only when your match needs to extend across multiple lines; in tools that otherwise process input one line at a time, it can reduce performance on larger files, since more text must be read into memory.
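What "multiline" means depends on the tool. As one concrete case, the sketch below uses Python's re module (an assumption, since the question does not name a program), where the MULTILINE flag changes where $ is allowed to match rather than how much input is read. The sample text is made up.

```python
import re

# Hypothetical three-line input with no trailing newline.
text = "total: 10%\nrate: 5\ndone!"

# Without MULTILINE, $ matches only at the end of the whole string
# (or just before a single trailing newline, if there is one).
at_string_end = re.findall(r"[^a-zA-Z0-9]$", text)

# With MULTILINE, $ also matches just before each newline,
# so the pattern is tested at the end of every line.
at_line_ends = re.findall(r"[^a-zA-Z0-9]$", text, flags=re.MULTILINE)

print(at_string_end)  # ['!']
print(at_line_ends)   # ['%', '!']
```

The middle line ends in the digit 5, so it contributes no match in either mode.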

Why don't you include a copy of sample input, the text you want to match, and the program you are using to do so?