I am struggling to understand the behavior of UriComponentsBuilder. I want to use it to encode a URL in a query parameter, however it appears to only escape % characters, but not other necessary characters such as &.
An example of a URL in a query parameter that is not encoded at all:
UriComponentsBuilder.fromUriString("http://example.com/endpoint")
.queryParam("query", "/path?foo=foo&bar=bar")
.build();
Output: http://example.com/endpoint?query=/path?foo=foo&bar=bar
This is not correct, because the unencoded & causes bar=bar to be interpreted as a query parameter to /endpoint instead of /path.
However, if I use an input that contains a % character:
UriComponentsBuilder.fromUriString("http://example.com/endpoint")
.queryParam("query", "/path?foo=%20bar")
.build();
Output: http://example.com/endpoint?query=/path?foo=%2520bar
The % character is escaped.
It seems inconsistent that UriComponentsBuilder would automatically escape the % characters but not the other reserved characters.
What is the correct process to encode a URL into a query parameter with UriComponentsBuilder?
In your example, the built UriComponents object is not encoded or normalised. To ensure that encoding is applied:
Encode it yourself by calling the encode() method (see also the normalize() method):
UriComponents u = UriComponentsBuilder.fromHttpUrl("http://example.com/endpoint")
.queryParam("query", "/path?foo=foo&bar=bar")
.build()
.encode();
// http://example.com/endpoint?query=/path?foo%3Dfoo%26bar%3Dbar
Use the build(true) method if the parameters used for constructing the UriComponents are already encoded. The value below is not encoded, so validation fails:
UriComponents u = UriComponentsBuilder.fromHttpUrl("http://example.com/endpoint")
.queryParam("query", "/path?foo=foo&bar=bar")
.build(true);
// IllegalArgumentException: Invalid character '=' for QUERY_PARAM in "/path?foo=foo&bar=bar"
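If you do want to go the build(true) route, the value has to be percent-encoded before it reaches queryParam(). One possible way to do that with the JDK alone is sketched below. The class and method names (PreEncodedParam, preEncode) are made up for illustration, and URLEncoder.encode(String, Charset) assumes Java 10 or later. Note that URLEncoder applies HTML form encoding, so a space becomes + rather than %20, which may or may not be what the receiving side expects.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PreEncodedParam {

    // Hypothetical helper: form-encodes a raw value so that it contains
    // only characters that are legal inside a query parameter.
    static String preEncode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = preEncode("/path?foo=foo&bar=bar");
        System.out.println(encoded);
        // prints: %2Fpath%3Ffoo%3Dfoo%26bar%3Dbar
    }
}
```

Passing the encoded string to queryParam("query", ...) and then calling build(true) should get past the validation, since it no longer contains a raw = or &.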
Under the hood, the HierarchicalUriComponents.encode(String) method performs the actual encoding. After a few internal calls it invokes HierarchicalUriComponents.encodeBytes(byte[], HierarchicalUriComponents.Type), where the HierarchicalUriComponents.Type enum controls which characters are allowed in which part of the URL. This check is based on RFC 3986. In short, Spring has its own encoding logic for every part of the URL.
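To make the per-component idea concrete, here is a minimal sketch of what such a check could look like for a query parameter value. This is not Spring's code. It is a standalone approximation written from the RFC 3986 grammar (query = pchar / "/" / "?"), with = and & excluded because they delimit parameters. The class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;

public class QueryParamEncoderSketch {

    // RFC 3986 "unreserved": ALPHA / DIGIT / "-" / "." / "_" / "~"
    static boolean isUnreserved(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Assumed rule for a query parameter value: unreserved characters,
    // sub-delims other than '=' and '&', plus ':', '@', '/' and '?'.
    static boolean isAllowedInQueryParam(int c) {
        if (c == '=' || c == '&') {
            return false;
        }
        return isUnreserved(c) || "!$'()*+,;:@/?".indexOf(c) >= 0;
    }

    // Percent-encodes every UTF-8 byte that is not allowed as-is.
    static String encodeQueryParam(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (isAllowedInQueryParam(c)) {
                out.append((char) c);
            } else {
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(c >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(c & 0xF, 16)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeQueryParam("/path?foo=foo&bar=bar"));
        // prints: /path?foo%3Dfoo%26bar%3Dbar
    }
}
```

With this rule set, / and ? survive untouched while = and & are escaped, which matches the shape of the encode() output shown earlier. A different component (path segment, fragment, and so on) would use a different allowed set.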