Is there a faster way to decode html characters to a string than Html.fromHtml()?

cottonBallPaws · Dec 1, 2010 · Viewed 14.4k times

I am using Html.fromHtml(STRING).toString() to convert a string that may or may not contain HTML markup and/or HTML entities into a plain-text string.

This is pretty slow: by my last measurement it took about 22 ms per call on average. With a large batch of strings, that can add up to over a minute. So I am looking for a faster option that is built for performance.

Is there any way to speed this up, or are there other decoding options available?

Edit: Since there doesn't appear to be a built-in method that is faster or designed specifically for performance, I will award the bounty to anyone who can point me to a library that:

  • Works well with Android
  • Is licensed for free use
  • Is faster than Html.fromHtml(String).toString()

As a note, I already tried Jsoup with Jsoup.parse(String).text(), and it was slower.

Answer

karlcow · Feb 3, 2011

What about unescapeHtml() in org.apache.commons.lang.StringEscapeUtils? The library is available on the Apache site.

(EDIT: June 2019 - See the comments below for updates about the library)
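
With commons-lang on the classpath, the call would simply be StringEscapeUtils.unescapeHtml(text). Note that it only decodes entities; unlike Html.fromHtml(), it leaves any tags in the string untouched. To show why this kind of decoding can be cheap, here is a rough, stdlib-only sketch of a single-pass entity decoder. The class name EntityDecoder and its tiny entity table are made up for illustration; this is not the commons-lang implementation, and a real decoder needs the full HTML entity list.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the commons-lang code.
final class EntityDecoder {
    // Deliberately tiny table (assumption): real HTML defines many more names.
    private static final Map<String, Character> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", '&');
        NAMED.put("lt", '<');
        NAMED.put("gt", '>');
        NAMED.put("quot", '"');
        NAMED.put("apos", '\'');
    }

    // Decodes known named entities and numeric references in one pass.
    // Anything it does not recognise is copied through unchanged.
    static String decode(String input) {
        int amp = input.indexOf('&');
        if (amp < 0) {
            return input; // fast path: nothing to decode, no allocation
        }
        StringBuilder out = new StringBuilder(input.length());
        int pos = 0;
        while (amp >= 0) {
            out.append(input, pos, amp); // copy text before the '&'
            int semi = input.indexOf(';', amp + 1);
            String decoded = semi < 0
                    ? null
                    : decodeEntity(input.substring(amp + 1, semi));
            if (decoded == null) {
                out.append('&'); // not an entity we know: keep the '&'
                pos = amp + 1;
            } else {
                out.append(decoded);
                pos = semi + 1;
            }
            amp = input.indexOf('&', pos);
        }
        out.append(input, pos, input.length()); // copy the tail
        return out.toString();
    }

    // Returns the decoded text for "amp", "#65", "#x41", ... or null if unknown.
    private static String decodeEntity(String name) {
        if (name.isEmpty()) {
            return null;
        }
        if (name.charAt(0) == '#') {
            try {
                boolean hex = name.length() > 1
                        && (name.charAt(1) == 'x' || name.charAt(1) == 'X');
                int codePoint = hex
                        ? Integer.parseInt(name.substring(2), 16)
                        : Integer.parseInt(name.substring(1), 10);
                if (!Character.isValidCodePoint(codePoint)) {
                    return null;
                }
                return new String(Character.toChars(codePoint));
            } catch (NumberFormatException e) {
                return null; // malformed numeric reference
            }
        }
        Character c = NAMED.get(name);
        return c == null ? null : String.valueOf(c);
    }

    public static void main(String[] args) {
        System.out.println(decode("Tom &amp; Jerry &lt;3 &#65;&#x42;"));
    }
}
```

The sketch scans for '&' with indexOf and returns the input as-is when none is found, so strings without entities cost almost nothing. Whether this, or the library call, actually beats Html.fromHtml() on a given device is something to measure rather than assume.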