Regular Expression to match #hashtag but not #hashtag; (with semicolon)

Wex · Jul 21, 2016 · Viewed 27.6k times

I currently have the following regular expression:

/(?<=[\s>]|^)#(\w*[A-Za-z_]+\w*)/g

I'm testing it against this string:

Here's a #hashtag and here is #not_a_tag; which should be different. Also testing: Mid#hash. #123 #!@£ and <p>#hash</p>

For my purposes, only two hashtags should be detected in this string: #hashtag and the #hash inside the <p> element. How can I alter the expression so that it doesn't match hashtags that are immediately followed by a semicolon? In my example, that is #not_a_tag;

Cheers.

Answer

tk78 · Jul 21, 2016

How about the following:

\B(\#[a-zA-Z]+\b)(?!;)


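  • \B -> Not at a word boundary, so the # cannot directly follow a word character (this rules out Mid#hash)
  • (\#[a-zA-Z]+\b) -> Capturing group: a literal # followed by one or more letters a-z or A-Z, ending at a word boundary
  • (?!;) -> Negative lookahead: the match must not be immediately followed by ;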
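
To check the pattern against the test string from the question, here is a minimal sketch in Python (the thread doesn't say which regex engine is in use, so the choice of Python is an assumption). ANSWER is the expression above, unchanged. VARIANT is not from the thread: it's a hypothetical adaptation of the question's original pattern that keeps \w-style tags and adds the same "not followed by ;" idea. The names TEXT, ANSWER and VARIANT are made up for the example.

```python
import re

# Test string copied from the question.
TEXT = (
    "Here's a #hashtag and here is #not_a_tag; which should be different. "
    "Also testing: Mid#hash. #123 #!@£ and <p>#hash</p>"
)

# Pattern from the answer above, unchanged.
ANSWER = re.compile(r"\B(\#[a-zA-Z]+\b)(?!;)")

# Hypothetical variant of the question's pattern (an assumption, not from the thread):
#   (?<![^\s>])  the # must not follow anything other than whitespace or '>'
#                (also true at the start of the string); a fixed-width
#                stand-in for (?<=[\s>]|^), which Python's re rejects
#   (?![\w;])    the tag must not be followed by a word character or ';',
#                so backtracking cannot produce a shortened tag
VARIANT = re.compile(r"(?<![^\s>])#(\w*[A-Za-z_]\w*)(?![\w;])")

answer_matches = ANSWER.findall(TEXT)
variant_matches = VARIANT.findall(TEXT)

print(answer_matches)   # ['#hashtag', '#hash']
print(variant_matches)  # ['hashtag', 'hash']
```

Both give the two expected hashtags for this string. One difference to be aware of: the answer's pattern only allows letters, so a tag like #tag_2 is not matched at all, whereas the variant accepts digits and underscores as the question's original expression did. The capture group also differs: the answer's group includes the #, the variant's does not.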