How should the contents of an href attribute be encoded: HTML encoding or URL encoding?
<a href="???">link text</a>
On the one hand, since the href attribute contains a URL, I should use URL encoding. On the other hand, I'm inserting this URL into HTML, so it must be HTML encoded.
Please help me to overcome this contradiction.
Thanks.
EDIT:
Here's the contradiction. Suppose the URL may contain '<' and '>' characters. URL encoding won't escape them, so reserved HTML characters will end up inside the href attribute, which violates the standard. HTML encoding will escape '<' and '>' and the HTML will be valid, but then there will be unexpected '&' characters in the URL ('&' is a reserved URL character, used as the delimiter between query string parameters).
Reserved URL characters form a superset of reserved HTML characters, except for '<' and '>', which are reserved in HTML but not in URLs.
EDIT 2:
I was wrong about the '<' and '>' characters: URL encoding does percent-encode them. If so, URL encoding alone is sufficient in this case, isn't it?
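Construct the URL as normal. Follow the rules for constructing URLs. Encode data as you put it into the URL.
Then construct the HTML as normal. Follow the rules for constructing HTML. Encode data (including the finished URL) as you put it into the HTML.
In other words, do both, in that order.
The two encodings aren't mutually exclusive, so there is no contradiction: the '&amp;' that HTML encoding produces is decoded back to '&' by the HTML parser before the URL is ever used.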
For example (this is simplified and assumes the data in $_GET exists and is valid; don't assume that in the real world):
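$search_term = $_GET['q'];
// Cast to int so the arithmetic below never operates on a non-numeric string.
$page = (int) $_GET['page'];
$next_page = $page + 1;
// Step 1: build the URL, URL-encoding each piece of data as it goes in.
$next_page_url = 'http://example.com/search?q=' . urlencode($search_term) . '&page=' . urlencode((string) $next_page);
// Step 2: build the HTML, HTML-encoding the finished URL as it goes into the attribute.
$html = '<a href="' . htmlspecialchars($next_page_url) . '">link text</a>';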
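To see why the two layers don't conflict, here is a minimal sketch with a hard-coded search term in place of $_GET. The sample value 'a<b & c' and the page number 2 are assumptions chosen for illustration. It prints the URL after step 1, prints the HTML after step 2, and then checks that undoing the HTML encoding (roughly what a browser does when it parses the attribute) gives back the original URL.

```php
<?php
// Hypothetical sample input standing in for $_GET['q'].
$search_term = 'a<b & c';

// Step 1: URL-encode the data while building the URL.
// '<' becomes %3C, the '&' inside the data becomes %26, spaces become '+'.
$url = 'http://example.com/search?q=' . urlencode($search_term) . '&page=' . urlencode('2');

// Step 2: HTML-encode the whole URL while building the attribute.
// Only the '&' that separates the parameters is left to escape, as '&amp;'.
$html = '<a href="' . htmlspecialchars($url) . '">link text</a>';

echo $url, "\n";
// http://example.com/search?q=a%3Cb+%26+c&page=2
echo $html, "\n";
// <a href="http://example.com/search?q=a%3Cb+%26+c&amp;page=2">link text</a>

// Decoding the attribute value restores the exact URL from step 1.
$round_trip = htmlspecialchars_decode(htmlspecialchars($url));
var_dump($round_trip === $url);
// bool(true)
```

The '&' that came from the user's data is already %26 by the time HTML encoding runs, so the only '&' that becomes '&amp;' is the real parameter delimiter, and it turns back into '&' when the HTML is parsed.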