Using regular expressions to find img tags without an alt attribute

awrigley · Oct 27, 2010 · Viewed 16.2k times

I am going through a large website (1600+ pages) to make it pass Priority 1 W3C WAI. As a result, things like image tags need to have alt attributes.

What would be the regular expression for finding img tags without alt attributes? If possible, with a wee explanation so I can use it to find other issues.

I am in an office with Visual Web Developer 2008. The Edit >> Find dialogue can use regular expressions.

Answer

Gruffy · Jul 22, 2013

Building on Mr.Black's and Roberts126's answers:

/(<img(?!.*?alt=(['"]).*?\2)[^>]*)(>)/

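This will match an img tag that either has no alt attribute or has an alt attribute that is not followed by a quoted value such as ="" or ='' (i.e. an invalid alt attribute). One caveat: the look-ahead is not confined to the tag itself. In engines where . does not match a newline it scans to the end of the line, so an alt-less img that is followed on the same line by any alt="..." will be missed.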

Breaking it down:

(          : open capturing group
<img       : match the opening of an img tag
(?!        : open negative look-ahead
.*?        : lazily match zero or more of any character
alt=(['"]) : match an 'alt' attribute followed by ' or " (captured as group 2 for later)
.*?        : lazily match the value of the 'alt' attribute
\2         : back-reference to the ' or " matched earlier
)          : close negative look-ahead
[^>]*      : match everything after '<img' up to the closing '>' of the img tag
)          : close capturing group
(>)        : match the closing '>' of the img tag
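
To check the breakdown against real input, here is a minimal sketch in Python. The `re` module is only my choice for illustration (the question is about an editor's Find dialogue), and the sample tags and the `TIGHTER` variant are hypothetical additions, not part of the answer above.

```python
import re

# The pattern from the answer, unchanged. A triple-quoted raw string avoids
# having to escape the ' and " inside the character class.
ALT_LESS_IMG = re.compile(r"""(<img(?!.*?alt=(['"]).*?\2)[^>]*)(>)""")

# Hypothetical variant (an assumption, not from the answer): [^>]*? stops the
# look-ahead from searching for alt= beyond the current tag's closing '>'.
TIGHTER = re.compile(r"""(<img(?![^>]*?alt=(['"]).*?\2)[^>]*)(>)""")

# Made-up sample tags, one per string.
samples = [
    '<img src="a.png">',             # no alt attribute
    '<img src="b.png" alt="logo">',  # properly quoted alt
    "<img src='c.png' alt=''>",      # empty but valid alt
    '<img src="d.png" alt=logo>',    # unquoted alt value
]

flagged = [tag for tag in samples if ALT_LESS_IMG.search(tag)]
print(flagged)
# ['<img src="a.png">', '<img src="d.png" alt=logo>']

# Two tags on one line: the first has no alt, the second does.
line = '<img src="a.png"><img src="b.png" alt="x">'
print(ALT_LESS_IMG.search(line))      # None: the look-ahead sees the later alt="x"
print(TIGHTER.search(line).group(0))  # <img src="a.png">
```

The last two lines show why it can be worth restricting the look-ahead to the tag when several img tags share a line.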

If your code editor allows search and replace by Regex you can use this in combination with the replace string:

$1 alt=""$3

This finds any alt-less img tags and adds an empty alt attribute to them, which is useful for spacers or other layout images in HTML emails and the like.
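
The same search and replace can be sketched in Python, assuming you would rather script it than use an editor. Python's `re.sub` spells the group references `\g<1>` and `\g<3>` instead of `$1` and `$3`; the function name `add_empty_alt` and the sample markup are made up for the example.

```python
import re

# The pattern from the answer, unchanged.
ALT_LESS_IMG = re.compile(r"""(<img(?!.*?alt=(['"]).*?\2)[^>]*)(>)""")


def add_empty_alt(html: str) -> str:
    """Insert alt="" just before the closing '>' of every matched img tag."""
    # Group 1 is everything up to the closing '>', group 3 is the '>' itself.
    return ALT_LESS_IMG.sub(r'\g<1> alt=""\g<3>', html)


html = '<td><img src="spacer.gif" width="1" height="1"></td>'
print(add_empty_alt(html))
# <td><img src="spacer.gif" width="1" height="1" alt=""></td>

# A tag that already has a quoted alt is left alone.
print(add_empty_alt('<img src="logo.png" alt="Logo">'))
# <img src="logo.png" alt="Logo">
```

Be aware that for self-closing tags written as `<img ... />` the slash ends up inside group 1, so the attribute would be inserted after the slash. Those tags need separate handling.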