Difference in URL decode/encode UTF-8 between Java and JS/AS3 (bug!?)

user710437 · May 26, 2011 · Viewed 9.6k times

I am having an issue URL-decoding a UTF-8 string in Java that was encoded with either JavaScript or ActionScript 3. I've set up a test case as follows:

The string in question is Produktgröße

When I encode with JS/AS3 I get the following string:

escape('Produktgröße')

Produktgr%F6%DFe

When I unescape this with JS, I get no change:

unescape('Produktgr%F6%DFe')

Produktgr%F6%DFe

So from this I assume that JS isn't encoding the string properly?

The following JSP produces this output:

<%@page import="java.net.URLEncoder"%>
<%@page import="java.net.URLDecoder"%>
<%=(URLDecoder.decode("Produktgr%F6%DFe","UTF-8"))%><br/>
<%=(URLEncoder.encode("Produktgröße","UTF-8"))%><br/>
<%=(URLEncoder.encode("Produktgröße"))%><br/>
<%=(URLDecoder.decode(URLEncoder.encode("Produktgröße")))%><br/>
<%=(URLDecoder.decode(URLEncoder.encode("Produktgröße"),"UTF-8"))%><br/>

Produktgr?e

Produktgr%C3%B6%C3%9Fe

Produktgr%C3%B6%C3%9Fe

Produktgröße

Produktgröße

Any idea why I'm having this disparity with the languages and why JS/AS3 isn't behaving as I expect it to?

Thanks.

Answer

Andy E · May 26, 2011

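escape is a deprecated function and does not encode non-ASCII characters as UTF-8: it emits %XX for code units below 256 (which is where the Latin-1 values %F6 and %DF come from) and %uXXXX for everything above. Use encodeURI or encodeURIComponent, which percent-encode the UTF-8 bytes instead; the latter is probably the most suitable for your needs.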
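
As a rough illustration of the difference, here is a minimal sketch using the string from the question. It assumes a JavaScript engine that still ships the legacy escape function (browsers and Node.js do); the variable names are only for illustration. The string is written with \u escapes so the result does not depend on the encoding of the source file.

```javascript
// "Produktgröße", written with \u escapes so the example does not depend on
// how the source file itself is encoded.
const original = 'Produktgr\u00f6\u00dfe';

// encodeURIComponent percent-encodes the UTF-8 bytes of each character.
// This matches what URLEncoder.encode(s, "UTF-8") produced in the question.
const encoded = encodeURIComponent(original);
console.log(encoded); // Produktgr%C3%B6%C3%9Fe

// decodeURIComponent reverses the encoding.
const decoded = decodeURIComponent(encoded);
console.log(decoded === original); // true

// The legacy escape() emits one %XX per code unit below 0x100
// (the Latin-1 values F6 and DF here), not UTF-8 bytes.
const legacy = escape(original);
console.log(legacy); // Produktgr%F6%DFe

// A UTF-8 decoder therefore rejects the escape() output as malformed.
let rejected = false;
try {
  decodeURIComponent(legacy);
} catch (e) {
  rejected = e instanceof URIError;
}
console.log(rejected); // true
```

With the encodeURIComponent form, the URLDecoder.decode(..., "UTF-8") call from the JSP in the question should get back the original string, since both sides then agree on UTF-8.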