Encoding query parameters with UriComponentsBuilder

Adam Millerchip · Nov 10, 2017

I am struggling to understand the behavior of UriComponentsBuilder. I want to use it to encode a URL in a query parameter; however, it appears to escape only % characters and not other reserved characters such as &.

An example of a URL in a query parameter that is not encoded at all:

UriComponentsBuilder.fromUriString("http://example.com/endpoint")
                    .queryParam("query", "/path?foo=foo&bar=bar")
                    .build();

Output: http://example.com/endpoint?query=/path?foo=foo&bar=bar

This is not correct, because the unencoded & causes bar=bar to be interpreted as a query parameter to /endpoint instead of /path.

However, if I use an input that contains a % character:

UriComponentsBuilder.fromUriString("http://example.com/endpoint")
                    .queryParam("query", "/path?foo=%20bar")
                    .build();

Output: http://example.com/endpoint?query=/path?foo=%2520bar

The % character is escaped.

It seems inconsistent that UriComponentsBuilder would automatically escape the % characters but not the other reserved characters.

What is the correct process to encode a URL into a query parameter with UriComponentsBuilder?

Answer

Karol Dowbecki · Aug 3, 2018

In your example, the built UriComponents object is neither encoded nor normalised. To ensure that encoding is applied:

  1. Encode it yourself by calling the encode() method (see also the normalize() method):

    UriComponents u = UriComponentsBuilder.fromHttpUrl("http://example.com/endpoint")
      .queryParam("query", "/path?foo=foo&bar=bar")
      .build()
      .encode(); 
    // http://example.com/endpoint?query=/path?foo%3Dfoo%26bar%3Dbar
    
  2. Use the build(true) method if the parameters used to construct the UriComponents are already encoded. It verifies the values instead of encoding them, so an unencoded value is rejected:

    UriComponents u = UriComponentsBuilder.fromHttpUrl("http://example.com/endpoint")
      .queryParam("query", "/path?foo=foo&bar=bar")
      .build(true);
    // IllegalArgumentException: Invalid character '=' for QUERY_PARAM in "/path?foo=foo&bar=bar"
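
For the second option, the value has to be percent-encoded before it reaches the builder. The following is a minimal sketch of one way to do that using only the JDK; the class name PreEncodedQueryParam and the helper encodeValue are made up for illustration and are not part of Spring. It assumes Java 10 or later for URLEncoder.encode(String, Charset). Note that URLEncoder applies HTML form encoding rather than RFC 3986 encoding, so the sketch swaps '+' for %20 afterwards.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PreEncodedQueryParam {

    // Hypothetical helper: percent-encode a value so it can be embedded
    // as a single query parameter value.
    public static String encodeValue(String raw) {
        // URLEncoder turns a space into '+' (form encoding); a literal '+'
        // in the input has already become %2B, so this replace is safe.
        return URLEncoder.encode(raw, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        String value = encodeValue("/path?foo=foo&bar=bar");
        System.out.println(value);

        // The encoded value contains only unreserved characters and
        // percent-escapes, so it should be acceptable to something like:
        //   UriComponentsBuilder.fromUriString("http://example.com/endpoint")
        //       .queryParam("query", value)
        //       .build(true);
        System.out.println("http://example.com/endpoint?query=" + value);
    }
}
```

This escapes more than strictly necessary (for example '/' and '?'), which is harmless in a query parameter value but produces a longer URL than encode() does.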
    

Under the hood, the HierarchicalUriComponents.encode(String) method performs the actual encoding. After a few internal calls it invokes HierarchicalUriComponents.encodeBytes(byte[], HierarchicalUriComponents.Type), where the HierarchicalUriComponents.Type enum controls which characters are allowed in which part of the URL. This check is based on RFC 3986. In short, Spring has its own encoding logic for every single part of the URL.
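
To illustrate the idea of a per-component allow-list, here is a simplified sketch in plain Java. It is not Spring's code: the class ComponentEncodingSketch, the Component enum and its two constants are hypothetical, and the character sets are my reading of the RFC 3986 grammar (unreserved, sub-delims, pchar), with the assumption that a query parameter value additionally disallows '=' and '&'.

```java
import java.nio.charset.StandardCharsets;

public class ComponentEncodingSketch {

    // Hypothetical stand-in for a "type" enum: each constant decides
    // which bytes may appear unescaped in that part of the URL.
    public enum Component {
        QUERY {
            @Override boolean isAllowed(int c) {
                return isPchar(c) || c == '/' || c == '?';
            }
        },
        QUERY_PARAM {
            @Override boolean isAllowed(int c) {
                // '=' and '&' delimit parameters, so escape them inside a value
                return c != '=' && c != '&' && QUERY.isAllowed(c);
            }
        };

        abstract boolean isAllowed(int c);

        static boolean isUnreserved(int c) {
            return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~';
        }

        static boolean isSubDelim(int c) {
            return "!$&'()*+,;=".indexOf(c) >= 0;
        }

        static boolean isPchar(int c) {
            return isUnreserved(c) || isSubDelim(c) || c == ':' || c == '@';
        }
    }

    // Percent-encode every UTF-8 byte that the given component does not allow.
    public static String encode(String source, Component type) {
        StringBuilder out = new StringBuilder();
        for (byte b : source.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (type.isAllowed(c)) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String value = "/path?foo=foo&bar=bar";
        System.out.println(encode(value, Component.QUERY));
        System.out.println(encode(value, Component.QUERY_PARAM));
    }
}
```

Under these assumptions the same input is left untouched when treated as a whole query string, while '=' and '&' are escaped when it is treated as a parameter value. A '%' is not in either allow-list, so it is always escaped as %25.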