I have always been confused by URL/HTML encoding and escaping. I am using PHP, so I want to clear some things up.
Can I say that I should always use
urlencode: for individual query string parts
$url = 'http://test.com?param1=' . urlencode('some data') . '&param2=' . urlencode('something else');
htmlentities: for escaping special characters like < and > so that they are rendered properly by the browser
Are there any other places where I might use each function? I am not good at all this escaping stuff and am always confused by it.
First off, you shouldn't be using htmlentities() around 99% of the time. Instead, you should use htmlspecialchars() for escaping text for use inside XML/HTML documents. htmlentities() is only useful for displaying characters that the character set you're using can't represent (for example, if your pages are in ASCII but you have some non-ASCII characters you would like to display). A better approach is to just make the whole page UTF-8 (it's not hard) and be done with it.
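To make the difference concrete, here is a minimal sketch comparing the two functions on the same input. The sample string is made up for illustration, and the script file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative input (not from the question); assumes this file is UTF-8 encoded.
$text = '<b>Café & "crème"</b>';

// htmlspecialchars() converts only the characters that are special in HTML:
// & < > " (and ' as well when ENT_QUOTES is passed).
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts every character that has a named entity,
// so é becomes &eacute; and è becomes &egrave;.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // &lt;b&gt;Café &amp; &quot;crème&quot;&lt;/b&gt;
echo $entities, "\n";  // &lt;b&gt;Caf&eacute; &amp; &quot;cr&egrave;me&quot;&lt;/b&gt;
```

Both results render identically in a browser when the page is served as UTF-8, which is why the extra conversion done by htmlentities() is rarely needed.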
As far as urlencode() goes, you hit the nail on the head.
So, to recap:
Inside HTML:
<b><?php echo htmlspecialchars($string, ENT_QUOTES, "UTF-8"); ?></b>
Inside of a URL:
$url = '?foo='.urlencode('bar');
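One place where the two rules meet is a URL printed inside an HTML attribute: it seems reasonable to apply both, in order. The sketch below assumes made-up parameter names and values (q, lang) and reuses the test.com host from the question.

```php
<?php
// Hypothetical parameter values, chosen to contain characters that need escaping.
$params = ['q' => 'fish & chips', 'lang' => 'en'];

// Step 1: URL-encode each value while assembling the query string.
$url = 'http://test.com/?q=' . urlencode($params['q'])
     . '&lang=' . urlencode($params['lang']);

// Step 2: HTML-escape the finished URL before placing it in an attribute,
// so the literal & that separates the parameters becomes &amp;.
$link = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">Search</a>';

echo $url, "\n";   // http://test.com/?q=fish+%26+chips&lang=en
echo $link, "\n";  // <a href="http://test.com/?q=fish+%26+chips&amp;lang=en">Search</a>
```

The order matters: urlencode() protects the individual values inside the URL, and htmlspecialchars() then protects the whole URL inside the HTML. PHP's built-in http_build_query() can also assemble the query string from an array instead of concatenating by hand.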