URL/HTML Escaping/Encoding

Jiew Meng · Jan 24, 2011

I have always been confused by URL/HTML encoding and escaping. I am using PHP, so I want to clear some things up.

Can I say that I should always use

  • urlencode: for individual query string parts

    $url = 'http://test.com?param1=' . urlencode('some data') . '&param2=' . urlencode('something else');
    
  • htmlentities: for escaping special characters like <> so that they will be rendered properly by the browser

Are there any other places where I might use each function? I am not good at all this escaping stuff and am always confused by it.

Answer

ircmaxell · Jan 24, 2011

First off, you shouldn't be using htmlentities() around 99% of the time. Instead, use htmlspecialchars() to escape text for use inside XML/HTML documents. htmlentities() is only useful for displaying characters that your page's native character set can't represent (for example, if your pages are in ASCII but you have some UTF-8 characters you would like to display). Rather than doing that, just make the whole page UTF-8 (it's not hard) and be done with it.
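To make the difference concrete, here is a minimal sketch comparing the two functions on the same input. The sample string is made up for illustration, and the snippet assumes the source file itself is saved as UTF-8.

```php
<?php
// Hypothetical sample input: markup characters plus one non-ASCII letter.
$text = '<b>Café</b> & "more"';

// htmlspecialchars() converts only the characters that are special in HTML:
// & < > " (and ' when ENT_QUOTES is set). The é is left untouched.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts every character that has a named entity,
// so é becomes &eacute;.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";
echo $entities, "\n";
```

On a page served as UTF-8, both results should display identically in the browser, which is why the extra conversion done by htmlentities() is rarely needed.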

As for urlencode(), you hit the nail on the head.

So, to recap:

  • Inside HTML:

    <b><?php echo htmlspecialchars($string, ENT_QUOTES, "UTF-8"); ?></b>
    
  • Inside a URL:

    $url = '?foo='.urlencode('bar');
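
The two cases often meet when a URL ends up inside an HTML attribute. A rough sketch of how they might be combined is below; the script name search.php and the parameter values are hypothetical, chosen only because they contain characters that need escaping.

```php
<?php
// Step 1: URL-encode each query string value on its own.
// (search.php and the parameter values are made up for this example.)
$url = 'search.php?q=' . urlencode('fish & chips') . '&lang=' . urlencode('en');

// Step 2: the finished URL is now just text going into an HTML document,
// so escape it with htmlspecialchars(). This turns the & separator into &amp;.
$href = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

echo '<a href="' . $href . '">Search</a>', "\n";
```

The order matters here: urlencode() is applied to the individual values while the URL is being built, and htmlspecialchars() is applied once to the whole URL at the point where it is written into the page.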