PHP filter_var() - FILTER_VALIDATE_URL

Alix Axel · Jan 26, 2010

The FILTER_VALIDATE_URL filter seems to have some trouble validating non-ASCII URLs:

var_dump(filter_var('http://pt.wikipedia.org/wiki/', FILTER_VALIDATE_URL)); // http://pt.wikipedia.org/wiki/
var_dump(filter_var('http://pt.wikipedia.org/wiki/Guimarães', FILTER_VALIDATE_URL)); // false

Why isn't the last URL correctly validated? And what are the possible workarounds? Running PHP 5.3.0.

I'd also like to know where I can find the source code of the FILTER_VALIDATE_URL validation filter.

Answer

Rasmus · Apr 1, 2010

Technically, that is not a valid URL according to section 5 of RFC 1738. Browsers automatically encode the ã character as %C3%A3 (its UTF-8 bytes, percent-encoded) before sending the request to the server. The technically valid full URL is:

http://pt.wikipedia.org/wiki/Guimar%C3%A3es

Pass that to the FILTER_VALIDATE_URL filter and it will work fine. The filter only validates against the spec; it doesn't try to fix or encode characters for you.
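
As one possible workaround, the encoding step that browsers perform can be approximated before validating. The sketch below is only an illustration, not part of PHP or of the original answer: encode_non_ascii() is a hypothetical helper name, and it assumes the input string is UTF-8 and that the non-ASCII characters appear in the path or query rather than in the host name (internationalized host names need a different treatment).

```php
<?php
// Hypothetical helper (not a PHP built-in): percent-encode every byte
// outside the ASCII range and leave all other characters untouched.
// Assumes $url is a UTF-8 string.
function encode_non_ascii($url)
{
    return preg_replace_callback(
        '/[\x80-\xFF]/',              // no /u flag: match raw bytes, one at a time
        function ($match) {
            return rawurlencode($match[0]);
        },
        $url
    );
}

// Assumes this source file is saved as UTF-8.
$url     = 'http://pt.wikipedia.org/wiki/Guimarães';
$encoded = encode_non_ascii($url);

var_dump($encoded);                                   // the percent-encoded form
var_dump(filter_var($encoded, FILTER_VALIDATE_URL));  // validate the encoded form
```

Encoding only the non-ASCII bytes, rather than running the whole string through rawurlencode(), keeps the scheme, slashes, and any existing percent-escapes intact. Note that this makes more inputs pass validation; it does not check that the original bytes were well-formed UTF-8.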