Regex pattern for shortcodes in PHP

Luca · Jul 5, 2012 · Viewed 10.5k times

I have a problem with a regex I wrote to match shortcodes in PHP.

This is the pattern, where $shortcode is the name of the shortcode:

\[$shortcode(.+?)?\](?:(.+?)?\[\/$shortcode\])?

Now, this regex behaves pretty much fine with these formats:

  • [shortcode]
  • [shortcode=value]
  • [shortcode key=value]
  • [shortcode=value]Text[/shortcode]
  • [shortcode key1=value1 key2=value2]Text[/shortcode]

But it seems to have problems with the most common format,

  • [shortcode]Text[/shortcode]

for which it returns the following matches:

Array
(
    [0] => [shortcode]Text[/shortcode]
    [1] => ]Text[/shortcode
)

As you can see, the second element (which I expected to be the text, since the first group is optional) contains the closing bracket of the opening tag, the text, and the whole closing tag except its last bracket.

EDIT: Found out that the match returned is the first capture, not the second. See the regex in Regexr.

Can you help with this, please? I'm really banging my head against the wall on this one.

Answer

Arnaud Le Blanc · Jul 5, 2012

In your regex:

\[$shortcode(.+?)?\](?:(.+?)?\[\/$shortcode\])?

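The first capture group, (.+?), must match at least one character.

The group as a whole is optional, but the engine tries to match it before trying to skip it. Right after the shortcode name, the only way for it to consume a character is to take the first ], and it then expands lazily until the next ], which is the last one in the string. That is why it captures everything up to the last ].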
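A minimal sketch that reproduces this behaviour. It assumes the shortcode is literally named shortcode and that the pattern is wrapped in / delimiters for preg_match; both are assumptions made for illustration, not details from the question.

```php
<?php
// Assumed shortcode name, for illustration only.
$shortcode = 'shortcode';

// Original pattern from the question, wrapped in / delimiters.
$pattern = "/\[$shortcode(.+?)?\](?:(.+?)?\[\/$shortcode\])?/";

preg_match($pattern, '[shortcode]Text[/shortcode]', $matches);

// Group 1 needs at least one character, so it takes the first "]" and then
// expands lazily until the next "]", which is the last one in the string.
// The optional closing-tag group then has nothing left to match.
print_r($matches);
```

The printed array has only two elements, the full match and group 1, which is the output shown in the question.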

The following regex works:

\[$shortcode(.*?)?\](?:(.+?)?\[\/$shortcode\])?

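The * quantifier means zero or more, while + means one or more. With *, the first group can match an empty string, so the \] that follows it matches the first closing bracket instead of the last one.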
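As a rough sketch of how the corrected pattern could be used, the hypothetical helper below (parse_shortcode is not from the original post) returns the attribute string and the enclosed text. The call to preg_quote is an added precaution for names that contain regex metacharacters, and the / delimiters are an assumption.

```php
<?php
// Hypothetical helper, for illustration only: applies the corrected pattern
// and returns the raw attribute string and the enclosed text.
function parse_shortcode(string $shortcode, string $input): ?array
{
    // Escape the name so characters such as "." or "+" are taken literally.
    $name = preg_quote($shortcode, '/');

    // Corrected pattern from the answer: (.*?) may match an empty string.
    $pattern = "/\[$name(.*?)?\](?:(.+?)?\[\/$name\])?/";

    if (preg_match($pattern, $input, $m) !== 1) {
        return null; // no shortcode found in the input
    }

    // Groups that did not take part in the match may be missing from $m,
    // so fall back to an empty string.
    return [
        'attributes' => $m[1] ?? '',
        'content'    => $m[2] ?? '',
    ];
}

var_dump(parse_shortcode('shortcode', '[shortcode]Text[/shortcode]'));
var_dump(parse_shortcode('shortcode', '[shortcode key=value]Text[/shortcode]'));
```

For the first call the attribute string is empty and the content is Text. For the second, the attribute string is " key=value" (with its leading space), which would still need to be split into key/value pairs separately.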