I have a string with "\u00a0" and need to replace it with "", but str_replace fails

0plus1 · Apr 7, 2010 · Viewed 72.5k times

I need to clean a string that is copied and pasted from various Microsoft Office applications (Excel, Access, and Word), each with its own encoding.

I'm using json_encode for debugging so that I can see every single encoded character.

I'm able to clean everything I found so far (\r \n) with str_replace, but with \u00a0 I have no luck.

$string = '[email protected]\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;[email protected]'; //this is the output from json_encode

$clean = str_replace("\u00a0", "",$string);

returns:

[email protected]\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;[email protected]

That is exactly the same; it completely ignores \u00a0.

Is there a way around this? Also, I feel like I'm reinventing the wheel: is there a function/class that completely strips EVERY possible char of EVERY possible encoding?

____EDIT____

After the first two replies I need to clarify that my example DOES work, because it's the output from json_encode, not the actual string!
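
To make that distinction concrete, here is a minimal sketch. It assumes the real string is UTF-8, and the sample value is made up for illustration. In a double-quoted PHP string, "\u00a0" (without braces) is not an escape sequence, so it only matches the literal text that json_encode prints, never the real no-break space:

```php
<?php
// Hypothetical sample: "a", a real no-break space (U+00A0, UTF-8 bytes 0xC2 0xA0), "b".
$actual = "a\xC2\xA0b";

// json_encode renders the character as the six-character escape \u00a0
// and wraps the result in double quotes.
$encoded = json_encode($actual);

// "\u00a0" here is a backslash followed by "u00a0", so it matches the
// JSON text but not the two real bytes in $actual.
$fromEncoded = str_replace("\u00a0", "", $encoded);
$fromActual  = str_replace("\u00a0", "", $actual);

var_dump($encoded, $fromEncoded, $fromActual === $actual);
```

So the replacement appears to work on the debug output while leaving the original string untouched.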

Answer

Arne · Jul 10, 2013

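By combining ord() with substr() on my string containing \u00a0, I found that the character is stored as the two bytes 194 and 160 (0xC2 0xA0, the UTF-8 encoding of U+00A0), and that the following works: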

$text = str_replace( chr( 194 ) . chr( 160 ), ' ', $text );
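
The same idea can be written in a couple of other ways. This is only a sketch and assumes $text is UTF-8; the sample value is hypothetical. It replaces with an empty string, as the question asks, rather than with a space as in the line above:

```php
<?php
// Assumption: $text is UTF-8 encoded. The sample value is made up.
$text = "first\xC2\xA0\xC2\xA0;second";

// chr(194) . chr(160) is the same two-byte sequence as "\xC2\xA0".
$viaBytes = str_replace("\xC2\xA0", '', $text);

// Unicode-aware regex: the /u modifier makes PCRE treat the pattern
// and the subject as UTF-8, so \x{00A0} matches the code point U+00A0.
$viaRegex = preg_replace('/\x{00A0}/u', '', $text);

var_dump($viaBytes, $viaRegex);
```

If the input turns out to be Windows-1252 or ISO-8859-1 instead, the no-break space is the single byte 0xA0, so neither pattern above would match until the string is converted to UTF-8. Note also that preg_replace with /u returns null when the subject is not valid UTF-8.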