Java URL encoding: URLEncoder vs. URI

John Farrelly · Jan 14, 2013 · Viewed 76.4k times

According to the W3Schools URL encoding page, @ should be encoded as %40 and a space should be encoded as %20.

I've tried both URLEncoder and URI, but neither does the above properly:

import java.net.URI;
import java.net.URLEncoder;

public class Test {
    public static void main(String[] args) throws Exception {

        // Prints me%40home.com (CORRECT)
        System.out.println(URLEncoder.encode("me@home.com", "UTF-8"));

        // Prints Email+Address (WRONG: Should be Email%20Address)
        System.out.println(URLEncoder.encode("Email Address", "UTF-8"));

        // http://www.home.com/test?Email%20Address=me@home.com
        // (WRONG: it has not encoded the @ in the email address)
        URI uri = new URI("http", "www.home.com", "/test", "Email Address=me@home.com", null);
        System.out.println(uri.toString());
    }
}

For some reason, URLEncoder encodes the email address correctly but not spaces, and URI encodes spaces correctly but not email addresses.

How should I encode these two parameters to be consistent with what W3Schools says is correct (or is W3Schools wrong)?

Answer

John Farrelly · Jan 20, 2013

Although I think the answer from @fge is the right one, I was using a 3rd party webservice that relied on the encoding outlined in the W3Schools article. So I followed the answer to the question "Java equivalent to JavaScript's encodeURIComponent that produces identical output?":

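import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Mimics JavaScript's encodeURIComponent on top of URLEncoder.
public static String encodeURIComponent(String s) {
    String result;

    try {
        result = URLEncoder.encode(s, "UTF-8")
                .replace("+", "%20")   // URLEncoder writes a space as '+'
                .replace("%21", "!")   // restore characters that encodeURIComponent leaves unescaped
                .replace("%27", "'")
                .replace("%28", "(")
                .replace("%29", ")")
                .replace("%7E", "~");
    } catch (UnsupportedEncodingException e) {
        // UTF-8 is always available, so this branch should never run
        result = s;
    }

    return result;
}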
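
To tie this back to the question, here is a minimal, self-contained sketch that applies the same idea to the two parameters from the question. The class name QueryEncodingDemo and the helpers encodeComponent and buildUrl are hypothetical names chosen for illustration, and the sketch only handles the space-to-%20 case rather than every character the method above restores.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; the names here are illustrative only.
final class QueryEncodingDemo {

    // Percent-encodes a single query component, writing a space as %20 instead of '+'.
    static String encodeComponent(String s) {
        try {
            // A literal '+' in the input is already encoded as %2B by URLEncoder,
            // so every '+' left in the output stands for a space.
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Joins one encoded name=value pair onto a base URL (assumed to have no query yet).
    static String buildUrl(String base, String name, String value) {
        return base + "?" + encodeComponent(name) + "=" + encodeComponent(value);
    }

    public static void main(String[] args) {
        // Prints http://www.home.com/test?Email%20Address=me%40home.com
        System.out.println(buildUrl("http://www.home.com/test", "Email Address", "me@home.com"));
    }
}
```

Encoding the name and the value separately, and only then joining them with = and ?, keeps the delimiters themselves from being escaped.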