Android default charset when sending http post/put - Problems with special characters

avendael · Jun 3, 2011

I have configured the Apache HttpClient like so:

    HttpProtocolParams.setContentCharset(httpParameters, "UTF-8");
    HttpProtocolParams.setHttpElementCharset(httpParameters, "UTF-8");

I also include the HTTP header "Content-Type: application/json; charset=UTF-8" for all HTTP POST and PUT requests.

I am trying to send HTTP POST/PUT requests with a JSON body that contains special characters (e.g. Chinese characters typed with the Google Pinyin keyboard, symbols, etc.). The characters appear as gibberish in the logs, but I think this is because DDMS does not support UTF-8, as described in this issue.

The problem is that when the server receives the request, it sometimes doesn't see the characters at all (especially the Chinese characters), or they turn into meaningless garbage when we retrieve them through a GET request.

I also tried putting 250 non-ASCII characters in a single field, because that particular field should be able to take up to 250 characters. However, it fails validation on the server side, which claims that the 250 character limit has been exceeded. 250 ASCII characters work just fine.

The server dudes claim that they support UTF-8. They even tried simulating a post request that contains Chinese characters, and the data was received by the server just fine. However, the guy (a Chinese guy) is using a Windows computer with the Chinese language pack installed (I think, because he can type Chinese characters on his keyboard).

I'm guessing that the charsets being used by the Android client and the server (made by Chinese guys btw) are not aligned. But I do not know which one is at fault, since the server dudes claim that they support UTF-8, and our REST client is configured to support UTF-8.

This got me wondering what charset Android uses by default for all text input, and whether it can be changed to a different one programmatically. I tried to find resources on how to do this for input widgets but I did not find anything useful.

Is there a way to set the charset for all input widgets in Android? Or maybe I missed something in the REST client configuration? Or maybe, just maybe, the server dudes are not using UTF-8 on their servers and are using Windows charsets instead?

Answer

avendael · Jun 3, 2011

Apparently, I forgot to set the StringEntity's charset to UTF-8. These lines did the trick:

    httpPut.setEntity(new StringEntity(body, HTTP.UTF_8));
    httpPost.setEntity(new StringEntity(body, HTTP.UTF_8));
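To see why the missing charset matters, here is a minimal sketch that uses only the JDK, no HttpClient. It assumes that the one-argument StringEntity constructor in the legacy HttpClient falls back to ISO-8859-1 (as far as I know that is the default there), and that the server decodes the body as UTF-8. The class and method names are made up for the illustration.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; not part of HttpClient or of the original code.
final class CharsetRoundTrip {
    static final Charset UTF_8 = Charset.forName("UTF-8");
    static final Charset LATIN_1 = Charset.forName("ISO-8859-1");

    // Encode the body with the entity's charset, then decode it the way a
    // UTF-8 server would (assumption: the server honours the UTF-8 header).
    static String sendAndReceive(String body, Charset entityCharset) {
        // Characters the charset cannot represent are replaced with '?'.
        byte[] wire = body.getBytes(entityCharset);
        return new String(wire, UTF_8);
    }

    public static void main(String[] args) {
        // JSON body containing two Chinese characters, written as escapes
        // so the source file encoding does not matter.
        String body = "{\"name\":\"\u4e2d\u6587\"}";

        // ISO-8859-1 cannot represent the Chinese characters, so they are lost.
        System.out.println(sendAndReceive(body, LATIN_1).equals(body)); // false

        // With UTF-8 on both sides the body survives unchanged.
        System.out.println(sendAndReceive(body, UTF_8).equals(body));   // true
    }
}
```

In the ISO-8859-1 case the server ends up with question marks instead of the Chinese characters, no matter what the Content-Type header says, which matches the "doesn't see the characters at all" symptom.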

So, there are at least two levels at which the charset has to be set in the Android client when sending an HTTP POST with non-ASCII characters:

  1. The REST client itself
  2. The StringEntity
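One way to keep the two levels from drifting apart is to derive both the declared charset and the encoded bytes from a single constant. The following is only a sketch with plain JDK classes; JsonPayload is a hypothetical helper, not something HttpClient provides.

```java
import java.nio.charset.Charset;

// Hypothetical helper; the name and fields are made up for illustration.
final class JsonPayload {
    // Single source of truth for the charset.
    private static final Charset CHARSET = Charset.forName("UTF-8");

    final byte[] bytes;        // what actually goes on the wire
    final String contentType;  // what the header tells the server

    JsonPayload(String json) {
        // Encode the body and build the header value from the same charset,
        // so the declared and the actual encoding cannot disagree.
        this.bytes = json.getBytes(CHARSET);
        this.contentType = "application/json; charset=" + CHARSET.name();
    }
}
```

With the legacy client, the bytes could then be wrapped in a ByteArrayEntity and contentType used as the Content-Type header value, although passing HTTP.UTF_8 to StringEntity as shown above is the simpler fix.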

UPDATE: As Samuel pointed out in the comments, the modern way to do it is to use a ContentType (ContentType.APPLICATION_JSON already carries the UTF-8 charset), like so:

    final StringEntity se = new StringEntity(body, ContentType.APPLICATION_JSON);
    httpPut.setEntity(se);