PostgreSQL Regex Word Boundaries?

mpen · Sep 29, 2010 · Viewed 15k times

Does PostgreSQL support \b?

I'm trying \bAB\b but it doesn't match anything, whereas (\W|^)AB(\W|$) does. These 2 expressions are essentially the same, aren't they?

Answer

Daniel Vandersluis · Sep 29, 2010

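Not as a word boundary: in PostgreSQL regular expressions, \b is the escape for a backspace character. PostgreSQL uses \m, \M, \y and \Y as word boundaries instead: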

\m   matches only at the beginning of a word
\M   matches only at the end of a word
\y   matches only at the beginning or end of a word
\Y   matches only at a point that is not the beginning or end of a word 

See Regular Expression Constraint Escapes in the manual.
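As a rough illustration of these escapes, the queries below use the ~ match operator and regexp_replace on made-up sample strings. They assume standard_conforming_strings is on (the default since PostgreSQL 9.1); with it off, the patterns would need to be written in the E'\\yAB\\y' form.

```sql
-- Sample strings are invented for illustration.
-- Assumes standard_conforming_strings = on; otherwise use E'\\yAB\\y'.
SELECT 'foo AB bar' ~ '\yAB\y';   -- true: AB stands alone as a word
SELECT 'fooABbar'   ~ '\yAB\y';   -- false: AB is inside a longer word
SELECT 'foo AB bar' ~ '\mAB\M';   -- true: \m = start of word, \M = end of word
SELECT 'fooABbar'   ~ '\YAB\Y';   -- true: neither edge of AB is a word boundary
SELECT 'foo AB bar' ~ '\bAB\b';   -- false: \b means backspace here

-- The boundary escapes are zero-width; (\W|^) and (\W|$) consume characters.
SELECT regexp_replace('foo AB bar', '\yAB\y', 'XY');           -- foo XY bar
SELECT regexp_replace('foo AB bar', '(\W|^)AB(\W|$)', 'XY');   -- fooXYbar
```

The last two queries show where the expressions from the question differ from a true boundary: both find the same word, but the \W version also swallows the surrounding spaces.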

There are also [[:<:]] and [[:>:]], which match the beginning and end of a word. From the manual:

There are two special cases of bracket expressions: the bracket expressions [[:<:]] and [[:>:]] are constraints, matching empty strings at the beginning and end of a word respectively. A word is defined as a sequence of word characters that is neither preceded nor followed by word characters. A word character is an alnum character (as defined by ctype) or an underscore. This is an extension, compatible with but not specified by POSIX 1003.2, and should be used with caution in software intended to be portable to other systems. The constraint escapes described below are usually preferable (they are no more standard, but are certainly easier to type).
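A small sketch of the bracket-expression form, again on invented sample strings, might look like this. The third query shows the effect of the underscore counting as a word character.

```sql
-- Sample strings are invented for illustration.
SELECT 'foo AB bar' ~ '[[:<:]]AB[[:>:]]';   -- true: AB is a whole word
SELECT 'fooABbar'   ~ '[[:<:]]AB[[:>:]]';   -- false: AB is inside a longer word
SELECT 'AB_1'       ~ '[[:<:]]AB[[:>:]]';   -- false: _ is a word character, so AB_1 is one word
```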